When people ask me "What do you do now?"
I reply "I am making contributions to the Sktime library."
The usual response is: "Oh! You mean Sklearn, as in scikit-learn?"
Then I say something like "Ermm, nope. I meant sktime, as in scikit-time."
Most people get confused. Then curiosity takes over and they ask, "What exactly is sktime and how is it different from sklearn?" Well, well... Here is what you need to know about sktime.
Sktime is simply
A unified toolbox for machine learning tasks that involve time series data. It is built in the Python programming language.
What is a Time Series?
A time series is a set of observations of a single entity, or of multiple entities, that are time-dependent. These sequential measurements, taken over a certain time interval, differ from cross-sectional data because the order and spacing of the observations in time matter (the concept of a "sequential time difference").
Photo Credit - Analytics Vidhya
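To make the difference a bit more concrete, here is a tiny sketch in plain Python. The sales figures are made up for illustration. It shows that a cross-sectional summary such as the mean does not care about the order of the observations, while a time series quantity such as the change from one step to the next depends entirely on it.

```python
from statistics import mean


def first_differences(values):
    """Return the change between each observation and the next one."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical monthly sales figures, listed in the order they were observed.
observed = [10, 12, 15, 19, 24]

# The same numbers with the time order reversed.
reordered = list(reversed(observed))

# A cross-sectional summary ignores the ordering of the rows...
same_mean = mean(observed) == mean(reordered)

# ...but the step-to-step changes tell a completely different story.
observed_changes = first_differences(observed)    # steady growth
reordered_changes = first_differences(reordered)  # steady decline

print(same_mean, observed_changes, reordered_changes)
```

The two lists contain exactly the same values, so any tool that treats rows as interchangeable sees no difference between them. Only the ordering tells you whether sales are rising or falling.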
Sktime vs. Sklearn?
The algorithms and tools in the Sklearn library are built to handle cross-sectional data, whilst the Sktime library extends Sklearn to handle time series data.
How does Sktime extend Sklearn?
Sktime's low-level interface extends the standard scikit-learn API to handle time series and panel data. Currently, the package implements:
- Various state-of-the-art approaches to supervised learning with time series features
- Transformation of time series, including series-to-series transforms (e.g. Fourier transform) and series-to-primitives transforms, a.k.a. feature extractors (e.g. mean, variance), sub-divided into fittables (on table) and row-wise applicates
- Pipelining, allowing multiple transformers to be chained with a final estimator
- Meta-learning strategies including tuning and ensembling, accepting pipelines as the base estimator
- Off-the-shelf composite strategies, such as a fully customisable random forest for time-series classification, with interval segmentation and feature extraction [1]
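The pipelining idea from the list above can be sketched in plain Python. This is not sktime's code: the classes `SummaryFeatures`, `NearestCentroid` and `Pipeline` below are hypothetical stand-ins I wrote for illustration, and the data is made up. The sketch chains a series-to-primitives transformer (mean and variance per series) with a final estimator.

```python
from statistics import mean, pvariance


class SummaryFeatures:
    """Hypothetical series-to-primitives transformer: series -> [mean, variance]."""

    def fit(self, X, y=None):
        return self  # nothing to learn for this transformer

    def transform(self, X):
        return [[mean(series), pvariance(series)] for series in X]


class NearestCentroid:
    """Hypothetical final estimator working on ordinary tabular features."""

    def fit(self, X, y):
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        # One centroid per class: the column-wise mean of its rows.
        self.centroids_ = {
            label: [mean(column) for column in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_,
                key=lambda label: squared_distance(row, self.centroids_[label]))
            for row in X
        ]


class Pipeline:
    """Hypothetical pipeline: apply each transformer, then the final estimator."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, X, y):
        for transformer in self.transformers:
            X = transformer.fit(X, y).transform(X)
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        for transformer in self.transformers:
            X = transformer.transform(X)
        return self.estimator.predict(X)


# Made-up panel data: each row is one short time series with a class label.
train_X = [[1, 1, 2, 1], [2, 1, 1, 2], [1, 9, 1, 9], [9, 1, 9, 2]]
train_y = ["flat", "flat", "spiky", "spiky"]

model = Pipeline([SummaryFeatures()], NearestCentroid()).fit(train_X, train_y)
predictions = model.predict([[1, 2, 1, 2], [8, 1, 9, 1]])
print(predictions)
```

The point of the design is that the final estimator never sees raw time series. The transformer turns each series into a row of numbers, so a classifier built for cross-sectional data can be reused as the last step.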
What do I do for Sktime?
I make contributions!
My most recent contribution has been the refactoring of existing forecasters (see here). Examples of such forecasters include:
- ThetaForecaster
- TransformTargetForecaster
We refactored the forecasters to make them robust and extendable.
Before, it was difficult to extend forecasters. This was because forecasters previously inherited from two major base classes (_SktimeForecaster and BaseForecaster), which was quite confusing. We decided to merge the contents of _SktimeForecaster and BaseForecaster into one new BaseForecaster.
There was also boilerplate code that was heavily repeated in the forecasters' methods, such as the fit, predict and update methods. This boilerplate consisted mainly of utility and check functions. Here is the link to a presentation on the refactoring.
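Here is a rough sketch of the idea behind the refactoring, written in plain Python. It is not the actual sktime code: the private method names `_fit`, `_predict` and `_update`, the check helpers, and the `LastValueForecaster` class are assumptions I made up for illustration. The single base class runs the shared checks once in the public fit, predict and update methods, and each concrete forecaster only fills in its own logic.

```python
class BaseForecaster:
    """Hypothetical single base class holding the shared boilerplate."""

    def __init__(self):
        self._is_fitted = False
        self._y = []

    # Public methods: checks live here once, not in every forecaster.
    def fit(self, y):
        self._y = self._check_y(y)
        self._fit(self._y)
        self._is_fitted = True
        return self

    def predict(self, horizon):
        self._check_is_fitted()
        if horizon < 1:
            raise ValueError("horizon must be at least 1")
        return self._predict(horizon)

    def update(self, y_new):
        self._check_is_fitted()
        self._y = self._y + self._check_y(y_new)
        self._update(y_new)
        return self

    # Shared utility and check functions (illustrative only).
    def _check_y(self, y):
        values = [float(value) for value in y]
        if not values:
            raise ValueError("y must contain at least one observation")
        return values

    def _check_is_fitted(self):
        if not self._is_fitted:
            raise RuntimeError("call fit before predict or update")

    # Hooks that concrete forecasters override.
    def _fit(self, y):
        raise NotImplementedError

    def _predict(self, horizon):
        raise NotImplementedError

    def _update(self, y_new):
        # Default behaviour in this sketch: refit on all data seen so far.
        self._fit(self._y)


class LastValueForecaster(BaseForecaster):
    """Toy forecaster: repeats the last observed value."""

    def _fit(self, y):
        self._last = y[-1]

    def _predict(self, horizon):
        return [self._last] * horizon


forecaster = LastValueForecaster().fit([1.0, 2.0, 3.0])
print(forecaster.predict(2))
forecaster.update([5.0])
print(forecaster.predict(1))
```

With this layout, adding a new forecaster means writing only the hooks. The input checks and the "is it fitted?" guard come for free from the base class, which is the kind of extendability the refactoring was after.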
I enjoyed working on the refactoring of the forecasters mainly because I got to understand how some of them are implemented. It was also fun because it was collaborative work: I worked on it with other sktime contributors. As an open source newbie, I believe this project was a good fit for my abilities because I got to learn new terms such as "boilerplate".
Thank you for reading. I look forward to writing more content about my open source journey. Cheers!
Reference
- [1] Alan Turing Institute - turing.ac.uk/research/research-projects/sk…