parallel_apply_over_df
sitelle.parallel.parallel_apply_over_df(df, func, axis=1, broadcast=False, raw=False, reduce=None, args=(), **kwargs)
Iterates over a pandas DataFrame to apply a function to each line, taking advantage of multiple cores. The signature is similar to pandas.DataFrame.apply.
Parameters:
- df (DataFrame) – The input dataframe
- func (callable) – The function to apply to each line (should accept a Series as input)
- axis ({0 or 'index', 1 or 'columns'}, default 1) – 0 or ‘index’: apply function to each column; 1 or ‘columns’: apply function to each row
- broadcast (boolean, default False) – For aggregation functions, return object of same size with values propagated
- raw (boolean, default False) – If False, convert each row or column into a Series. If raw=True the passed function will receive ndarray objects instead. If you are just applying a NumPy reduction function this will achieve much better performance
- reduce (boolean or None, default None) – Try to apply reduction procedures. If the DataFrame is empty, apply will use reduce to determine whether the result should be a Series or a DataFrame. If reduce is None (the default), apply’s return value will be guessed by calling func on an empty Series (note: while guessing, exceptions raised by func will be ignored). If reduce is True a Series will always be returned, and if False a DataFrame will always be returned
- modules (tuple of strings) – The import statements to run so that func works correctly. Example: ('import numpy as np',)
- depfuncs (tuple of strings) – The functions used by func but defined outside of its body
- args – Additional arguments to be passed to func
- kwargs – Additional keyword arguments to be passed to func
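The parameters above can be illustrated with a minimal sketch. It first computes a serial reference with plain pandas.DataFrame.apply, since the signature is described as similar, and then wraps the parallel call in a function. The names row_norm, df and run_parallel are hypothetical and chosen only for this example. The sketch also assumes that modules is accepted as a keyword argument, as the parameter list suggests, although it does not appear explicitly in the signature.

```python
import numpy as np
import pandas as pd


def row_norm(row):
    """Return the Euclidean norm of the 'x' and 'y' entries of one row (a Series)."""
    return np.sqrt(row["x"] ** 2 + row["y"] ** 2)


# Small illustrative dataframe (hypothetical data).
df = pd.DataFrame({"x": [3, 5], "y": [4, 12]})

# Serial reference with plain pandas: row_norm is called once per row.
serial = df.apply(row_norm, axis=1)

# With raw=True the function receives each row as an ndarray instead of a Series.
raw_sums = df.apply(np.sum, axis=1, raw=True)


def run_parallel(frame):
    """Hypothetical wrapper: the same row-wise apply, spread over several cores.

    Assumes sitelle is installed and that `modules` is accepted as a keyword
    argument. row_norm uses `np`, so the import statement is passed along.
    """
    from sitelle.parallel import parallel_apply_over_df

    return parallel_apply_over_df(
        frame, row_norm, axis=1, modules=("import numpy as np",)
    )
```

If sitelle is available, run_parallel(df) would be expected to match serial, but that equivalence is inferred from the description above and is not verified here.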