Sklearn Python

Posted by admin

auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.

auto-sklearn frees a machine learning user from algorithm selection and hyperparameter tuning. It leverages recent advances in Bayesian optimization, meta-learning and ensemble construction. Learn more about the technology behind auto-sklearn by reading our paper published at NIPS 2015.

NEW: Auto-sklearn 2.0

There are several Python libraries which provide solid implementations of a range of machine learning algorithms. One of the best known is Scikit-Learn, a package that provides efficient versions of a large number of common algorithms.

Auto-sklearn 2.0 includes the latest research on automatically configuring the AutoML system itself and contains a multitude of improvements which speed up fitting the AutoML system.

auto-sklearn 2.0 works the same way as regular auto-sklearn, and you can use it via the AutoSklearn2Classifier class in the autosklearn.experimental.askl2 module.

A paper describing our advances is available on arXiv.

Example

This will run for one hour and should result in an accuracy above 0.98.
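The example code itself is missing from this page. The following is a minimal sketch of what such a run might look like, assuming the digits dataset that ships with scikit-learn and a simple train/test split. The function name run_digits_example and its parameters are illustrative and not part of auto-sklearn; the imports are deferred so that defining the function does not require auto-sklearn to be installed.

```python
def run_digits_example(time_limit_s=3600, use_v2=False):
    """Fit auto-sklearn on the digits dataset and return the test accuracy.

    time_limit_s: overall search budget in seconds (one hour by default).
    use_v2:       if True, use the experimental Auto-sklearn 2.0 classifier.
    """
    # Deferred imports: these are only needed when the example is run.
    import sklearn.datasets
    import sklearn.metrics
    import sklearn.model_selection

    if use_v2:
        from autosklearn.experimental.askl2 import AutoSklearn2Classifier as Estimator
    else:
        from autosklearn.classification import AutoSklearnClassifier as Estimator

    X, y = sklearn.datasets.load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
        X, y, random_state=1
    )

    # The estimator is used like any scikit-learn classifier: fit, then predict.
    automl = Estimator(time_left_for_this_task=time_limit_s)
    automl.fit(X_train, y_train)
    y_hat = automl.predict(X_test)
    return sklearn.metrics.accuracy_score(y_test, y_hat)
```

Calling run_digits_example() starts the full one-hour search; passing a smaller time_limit_s gives a quicker, and likely less accurate, run.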

License

auto-sklearn is licensed the same way as scikit-learn, namely the 3-clause BSD license.

Citing auto-sklearn

If you use auto-sklearn in a scientific publication, we would appreciate a reference to the following paper:

Efficient and Robust Automated Machine Learning, Feurer et al., Advances in Neural Information Processing Systems 28 (NIPS 2015).

If you are using Auto-sklearn 2.0, please also cite

Auto-Sklearn 2.0: The Next Generation, Feurer et al., (arXiv, 2020).

Contributing

We appreciate all contributions to auto-sklearn, from bug reports and documentation to new features. If you want to contribute to the code, you can pick an issue from the issue tracker which is marked with Needs contributer.

Note

To avoid spending time on duplicate work or features that are unlikely to get merged, it is highly advised that you contact the developers by opening a GitHub issue before starting to work.

When developing new features, please create a new branch from the development branch. When submitting a pull request, make sure that all tests are still passing.

There are different ways to install scikit-learn:

  • Install the latest official release. This is the best approach for most users. It will provide a stable version and pre-built packages are available for most platforms.

  • Install the version of scikit-learn provided by your operating system or Python distribution. This is a quick option for those who have operating systems or Python distributions that distribute scikit-learn. It might not provide the latest release version.

  • Building the package from source. This is best for users who want the latest-and-greatest features and aren’t afraid of running brand-new code. This is also needed for users who wish to contribute to the project.

Installing the latest release

First install Python 3 or conda, depending on your operating system and packager:

  • Windows: install the 64-bit version of Python 3, for instance from https://www.python.org.

  • macOS: install Python 3 using homebrew (brew install python) or by manually installing the package from https://www.python.org.

  • Linux: install python3 and python3-pip using the package manager of the Linux distribution.

  • conda: install conda using the Anaconda or miniconda installers or the miniforge installers (no administrator permission required for any of those).

Then install scikit-learn with pip or conda.

To check your installation, you can list the installed packages with pip or conda, or import sklearn from Python and call sklearn.show_versions().
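The same check can be scripted. The sketch below relies only on the standard library (importlib.metadata, available from Python 3.8); the helper names installed_version and describe_sklearn_install are illustrative and not part of scikit-learn.

```python
import importlib.metadata
import importlib.util


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(distribution)
    except importlib.metadata.PackageNotFoundError:
        return None


def describe_sklearn_install():
    """Summarize whether scikit-learn is installed and importable."""
    version = installed_version("scikit-learn")
    if version is None:
        return "scikit-learn is not installed in this environment"
    # The distribution is named scikit-learn, but the import name is sklearn.
    importable = importlib.util.find_spec("sklearn") is not None
    status = "importable" if importable else "installed but not importable"
    return f"scikit-learn {version} ({status})"


print(describe_sklearn_install())
```

If the summary reports that scikit-learn is missing even though you installed it, the most likely cause is that a different environment is active than the one you installed into.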

Note that in order to avoid potential conflicts with other packages it is strongly recommended to use a virtual environment (venv) or a conda environment.

Using such an isolated environment makes it possible to install a specific version of scikit-learn with pip or conda and its dependencies independently of any previously installed Python packages. In particular under Linux it is discouraged to install pip packages alongside the packages managed by the package manager of the distribution (apt, dnf, pacman…).

Note that you should always remember to activate the environment of your choice prior to running any Python command whenever you start a new terminal session.
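As a rough illustration, a virtual environment can also be created from Python with the standard-library venv module. The helper name create_environment and the directory name sklearn-env used in the usage note are placeholders chosen for this sketch.

```python
import os
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment and return the directory holding its executables."""
    env_dir = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment via ensurepip.
    venv.create(env_dir, with_pip=with_pip)
    # Executables live in Scripts/ on Windows and in bin/ elsewhere.
    return env_dir / ("Scripts" if os.name == "nt" else "bin")
```

For example, create_environment("sklearn-env") creates the environment in the current directory. It still has to be activated in each new terminal session (with the activate script in the returned directory) before scikit-learn is installed into it or used from it.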

If you have not installed NumPy or SciPy yet, you can also install these using conda or pip. When using pip, please ensure that binary wheels are used, and NumPy and SciPy are not recompiled from source, which can happen when using particular configurations of operating system and hardware (such as Linux on a Raspberry Pi).

Scikit-learn plotting capabilities (i.e., functions whose names start with “plot_” and classes whose names end with “Display”) require Matplotlib. The examples require Matplotlib and some examples require scikit-image, pandas, or seaborn. The minimum versions of scikit-learn's dependencies are listed below along with their purpose.

Dependency        Minimum Version   Purpose
numpy             1.13.3            build, install
scipy             0.19.1            build, install
joblib            0.11              install
threadpoolctl     2.0.0             install
cython            0.28.5            build
matplotlib        2.1.1             benchmark, docs, examples, tests
scikit-image      0.13              docs, examples, tests
pandas            0.25.0            benchmark, docs, examples, tests
seaborn           0.9.0             docs, examples
memory_profiler   0.57.0            benchmark, docs
pytest            5.0.1             tests
pytest-cov        2.9.0             tests
flake8            3.8.2             tests
mypy              0.770             tests
pyamg             4.0.0             tests
sphinx            3.2.0             docs
sphinx-gallery    0.7.0             docs
numpydoc          1.0.0             docs
Pillow            7.1.2             docs
sphinx-prompt     1.3.0             docs
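To compare an environment against the install-time minimums listed above, something along the following lines could be used. This is a sketch under simplifying assumptions: versions are compared on their leading numeric components only (pre-release suffixes such as rc1 are ignored), and the helper names are illustrative. The third-party packaging library provides a more complete version parser if it is available.

```python
import importlib.metadata
import re

# Minimum versions of the dependencies marked "install" in the list above.
MINIMUM_VERSIONS = {
    "numpy": "1.13.3",
    "scipy": "0.19.1",
    "joblib": "0.11",
    "threadpoolctl": "2.0.0",
}


def parse_version(text):
    """Turn '1.13.3' into (1, 13, 3), stopping at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_install_dependencies():
    """Return {name: (installed_version_or_None, is_new_enough)}."""
    report = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            installed = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report
```

Calling check_install_dependencies() returns one entry per dependency, with None as the version for any package that is not installed.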

Warning

Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4. Scikit-learn 0.21 supported Python 3.5-3.7. Scikit-learn 0.22 supported Python 3.5-3.8. Scikit-learn now requires Python 3.6 or newer.
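The support ranges in this warning can be encoded as a small lookup. The sketch below only covers the versions the warning mentions; the function name and the "current" label (standing for the release this guide describes) are illustrative.

```python
import sys


def newest_compatible_release(python_version=None):
    """Map a (major, minor) Python version to the newest scikit-learn series
    that the warning above lists as supporting it, or None if none is listed."""
    major, minor = (python_version or sys.version_info)[:2]
    if (major, minor) >= (3, 6):
        return "current"  # the current release requires Python 3.6 or newer
    if (major, minor) == (3, 5):
        return "0.22"     # 0.22 supported Python 3.5-3.8
    if (major, minor) in {(2, 7), (3, 4)}:
        return "0.20"     # last version to support Python 2.7 and 3.4
    return None


print(newest_compatible_release())
```

Without an argument the function inspects the running interpreter; passing a tuple such as (3, 5) answers the question for another Python version.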

Note

For installing on PyPy, PyPy3-v5.10+, NumPy 1.14.0+, and SciPy 1.1.0+ are required.

Installing on Apple Silicon M1 hardware

The recently introduced macos/arm64 platform (sometimes also known as macos/aarch64) requires the open source community to upgrade the build configuration and automation to properly support it.

At the time of writing (January 2021), the only way to get a working installation of scikit-learn on this hardware is to install scikit-learn and its dependencies from the conda-forge distribution, for instance using the miniforge installers.

An issue on the scikit-learn issue tracker follows progress on making it possible to install scikit-learn from PyPI with pip.
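Whether this section applies to a given machine can be checked from Python. The following is a small sketch using the standard-library platform module; the function name is illustrative. Note that an x86_64 Python running under Rosetta reports x86_64 rather than arm64, so the check describes the running interpreter, not the hardware.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True if the interpreter runs natively on the macos/arm64 platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine in {"arm64", "aarch64"}


if is_apple_silicon():
    print("macos/arm64 detected: install scikit-learn from conda-forge")
else:
    print("not macos/arm64: the regular pip or conda instructions apply")
```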

Third party distributions of scikit-learn

Some third-party distributions provide versions of scikit-learn integrated with their package-management systems.

These can make installation and upgrading much easier for users since the integration includes the ability to automatically install dependencies (numpy, scipy) that scikit-learn requires.

The following is an incomplete list of OS and Python distributions that provide their own version of scikit-learn.

Arch Linux

Arch Linux’s package is provided through the official repositories as python-scikit-learn for Python. It can be installed with pacman.

Debian/Ubuntu

The Debian/Ubuntu package is split into three different packages called python3-sklearn (Python modules), python3-sklearn-lib (low-level implementations and bindings), and python3-sklearn-doc (documentation). Only the Python 3 version is available in Debian Buster (the more recent Debian distribution). Packages can be installed using apt-get.

Fedora

The Fedora package is called python3-scikit-learn for the Python 3 version, the only one available in Fedora 30. It can be installed using dnf.
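For the Arch Linux, Debian/Ubuntu and Fedora packages described above, the matching install command can be picked by looking for the distribution's package manager on the PATH. This is only a sketch: the helper names are illustrative, the commands are built from the package names given above, and the script prints the command instead of running it because installation needs administrator rights.

```python
import shutil

# Package manager -> install command, using the package names listed above.
INSTALL_COMMANDS = {
    "pacman": ["sudo", "pacman", "-S", "python-scikit-learn"],
    "apt-get": ["sudo", "apt-get", "install",
                "python3-sklearn", "python3-sklearn-lib", "python3-sklearn-doc"],
    "dnf": ["sudo", "dnf", "install", "python3-scikit-learn"],
}


def detect_package_manager():
    """Return the first known package manager found on the PATH, or None."""
    for tool in INSTALL_COMMANDS:
        if shutil.which(tool):
            return tool
    return None


def install_command(tool):
    """Return a copy of the install command for the given package manager."""
    return list(INSTALL_COMMANDS[tool])


tool = detect_package_manager()
if tool is None:
    print("No supported package manager found")
else:
    print(" ".join(install_command(tool)))
```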

NetBSD

scikit-learn is available via pkgsrc-wip.

MacPorts for Mac OSX

The MacPorts package is named py<XY>-scikits-learn, where XY denotes the Python version. It can be installed using the port command.

Anaconda and Enthought Deployment Manager for all supported platforms

Anaconda and Enthought Deployment Manager both ship with scikit-learn in addition to a large set of scientific Python libraries for Windows, Mac OSX and Linux.

Anaconda offers scikit-learn as part of its free distribution.

Intel conda channel

Intel maintains a dedicated conda channel that ships scikit-learn.

This version of scikit-learn comes with alternative solvers for some common estimators. Those solvers come from the DAAL C++ library and are optimized for multi-core Intel CPUs.

Note that those solvers are not enabled by default; please refer to the daal4py documentation for more details.

Compatibility with the standard scikit-learn solvers is checked by running the full scikit-learn test suite via automated continuous integration as reported on https://github.com/IntelPython/daal4py.

WinPython for Windows

The WinPython project distributes scikit-learn as an additional plugin.

Troubleshooting

Error caused by file path length limit on Windows

It can happen that pip fails to install packages when reaching the default path size limit of Windows if Python is installed in a nested location such as the AppData folder structure under the user home directory.

In this case it is possible to lift that limit in the Windows registry by using the regedit tool:

  1. Type “regedit” in the Windows start menu to launch regedit.

  2. Go to the Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem key.

  3. Edit the value of the LongPathsEnabled property of that key and set it to 1.

  4. Reinstall scikit-learn (ignoring the previous broken installation), for instance with pip install --exists-action=i scikit-learn.
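Before editing the registry, it can help to confirm that the path limit is the actual cause. The sketch below checks a path against the legacy Windows limit of 260 characters and, on Windows only, reads the LongPathsEnabled value from the key named in step 2. The function names are illustrative; on other operating systems the registry check simply returns None.

```python
import sys

# Legacy Windows MAX_PATH limit in characters, including the terminating null.
LEGACY_MAX_PATH = 260


def exceeds_legacy_limit(path):
    """Return True if the path is too long for the legacy Windows limit."""
    return len(str(path)) >= LEGACY_MAX_PATH


def long_paths_enabled():
    """Return True or False from the registry on Windows, or None if unknown."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return None  # the value is missing or cannot be read
    return value == 1
```

A site-packages path for which exceeds_legacy_limit returns True, combined with long_paths_enabled() returning False, matches the failure described in this section.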