
Top 10 Python libraries you should know in 2019

Via: 雷锋网 (Leiphone)     Time: 2019/7/31 21:27:38

We will discuss the following 10 libraries:

  1. TensorFlow

  2. Scikit-Learn

  3. Numpy

  4. Keras

  5. PyTorch

  6. LightGBM

  7. Eli5

  8. SciPy

  9. Theano

  10. Pandas

Introduction

Python is one of the most popular and widely used programming languages, and it has replaced many other languages in the industry.

Python is popular among developers for many reasons. The most important one, however, is the large number of libraries available to its users.

Python's simplicity has attracted many developers to create new libraries for machine learning, and this large collection of libraries has made Python very popular among machine learning experts.

So the first library to be introduced here is TensorFlow.

1. TensorFlow

What is TensorFlow?

If you are currently using Python for machine learning projects, you may have heard of this popular open source library, TensorFlow.

The library was developed by Google in collaboration with the Brain Team, and almost every Google machine learning application uses TensorFlow.

TensorFlow works like a computing library for writing new algorithms that involve a large number of tensor operations. Since neural networks can be easily expressed as computational graphs, they can be implemented in TensorFlow as a series of operations on tensors. A tensor is an N-dimensional array that represents data.
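To make the idea of "a neural network as a series of tensor operations" concrete, here is a minimal sketch written with Numpy rather than TensorFlow itself. The layer sizes, parameter names and random data are assumptions chosen purely for illustration; a real TensorFlow program would express the same chain of operations through its own API.

```python
import numpy as np

# Hypothetical toy data: a batch of 4 samples with 3 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # a rank-2 tensor (a matrix)

# Parameters of a small two-layer network (names are illustrative).
w1 = rng.normal(size=(3, 5))
b1 = np.zeros(5)
w2 = rng.normal(size=(5, 2))
b2 = np.zeros(2)


def forward(inputs):
    """Forward pass written as a chain of tensor operations."""
    hidden = np.maximum(inputs @ w1 + b1, 0.0)   # matmul -> add -> ReLU
    logits = hidden @ w2 + b2                    # matmul -> add
    # Softmax over the last axis, shifted for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


probs = forward(x)
print(probs.shape)
```

Each line of `forward` corresponds to one node (or a few nodes) of the computational graph: matrix multiplication, addition, an activation function and a normalization.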

Characteristics of TensorFlow

1. Responsive Construct

With TensorFlow, we can easily visualize each part of the graph, which is not an option with Numpy or Scikit-Learn.

2. Flexibility

A very important feature of TensorFlow is its flexibility: it is modular, so the parts you want to use on their own can be made standalone.

3. Easy to train

It is easy to train on both CPUs and GPUs for distributed computing.

4. Parallel Neural Network Training

TensorFlow offers pipelining, in the sense that you can train multiple neural networks on multiple GPUs, which makes models very efficient on large-scale systems.

5. Large Communities

Needless to say, since it was developed by Google, there is already a large team of software engineers who work continuously on stability improvements.

6. Open source

One of the best things about this machine learning library is that it is open source, so anyone with an Internet connection can use it.

Where is TensorFlow used?

You use TensorFlow every day, if only indirectly. Applications such as Google Voice Search and Google Photos are developed using this library.

The libraries in TensorFlow are written in C and C++, with a sophisticated front end for Python. Your Python code is translated and then executed on the TensorFlow distributed execution engine, which is built with C and C++.

In fact, the number of possible applications of TensorFlow is virtually unlimited, and that is the beauty of it.

2. Scikit-Learn

What is Scikit-Learn?

It is a Python library associated with Numpy and SciPy. It is considered to be one of the best libraries for working with complex data.

A lot of changes have been made to this library. One of them is the cross-validation feature, which now supports more than one metric. Many training methods, such as logistic regression and nearest neighbors, have also received minor improvements.

Characteristics of Scikit-Learn

  1. Cross-validation: There are several ways to check the accuracy of supervised models on unseen data.

  2. Unsupervised learning algorithms: There is also a wide range of algorithms on offer, from clustering, factor analysis and principal component analysis to unsupervised neural networks.

  3. Feature extraction: Used to extract features from images and text (for example, a paragraph of text).
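As a short illustration of cross-validation with more than one metric, the following sketch uses Scikit-Learn's `cross_validate` function. The dataset (iris), the model (logistic regression) and the two metrics are assumptions picked only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Small built-in dataset, chosen only for illustration.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation, scoring every held-out fold with two metrics.
scores = cross_validate(model, X, y, cv=5, scoring=["accuracy", "f1_macro"])

print(scores["test_accuracy"].mean())
print(scores["test_f1_macro"].mean())
```

The returned dictionary holds one array per metric, with one entry per fold, so the spread across folds can be inspected as well as the mean.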

Where is Scikit-Learn used?

It contains many algorithms for standard machine learning and data mining tasks, such as dimensionality reduction, classification, regression, clustering and model selection.

3. Numpy

What is Numpy?

Numpy is considered one of the most popular machine learning libraries in Python.

TensorFlow and other libraries use Numpy internally to perform many operations on tensors. The array interface is the best and most important feature of Numpy.

Characteristics of Numpy

  1. Interactive: Numpy is very easy to understand and use

  2. Mathematics: Makes complex mathematical implementations very simple

  3. Intuitive: Makes coding much easier and the concepts easier to grasp

  4. Lots of interaction: Widely used, so there are many open source contributors

Where is Numpy used?

This interface can be used to represent images, sound waves, and other binary raw streams as N-dimensional arrays of real numbers.
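As a small sketch of that idea, the snippet below stores an image and a sound clip as Numpy arrays. The image size, the grayscale weights and the 440 Hz tone are assumptions made up for the example, not data from any real file.

```python
import numpy as np

# A hypothetical 4x6 RGB image: height x width x channels, 8-bit values.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, :3, 0] = 255               # left half is pure red

# Convert to grayscale with a weighted sum over the channel axis.
weights = np.array([0.299, 0.587, 0.114])
gray = image @ weights              # shape (4, 6), floating point

# A hypothetical one-second mono sound clip at 8 kHz: a 440 Hz sine wave.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

print(gray.shape, tone.shape)
```

Once the data is in array form, the same slicing and arithmetic operations apply regardless of whether the array holds pixels or audio samples.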

Knowledge of Numpy is very important for full-stack developers who implement machine learning libraries.

4. Keras

What is Keras?

Keras is considered one of the coolest machine learning libraries in Python. It provides an easier mechanism for expressing neural networks. Keras also provides some of the best utilities for compiling models, processing data sets, visualizing graphs, and so on.

On the back end, Keras uses either Theano or TensorFlow internally. Some other popular neural network toolkits, such as CNTK, can also be used. Compared with other machine learning libraries, Keras is relatively slow, because it first creates a computational graph using the back-end infrastructure and then uses that graph to perform operations. All Keras models are very simple.

Characteristics of Keras

  • It runs smoothly on both CPU and GPU.

  • Keras supports almost all neural network models: fully connected, convolutional, pooling, recurrent, embedding, etc. In addition, these models can be combined to build more complex models.

  • Keras is modular by nature, which makes it highly expressive, flexible and well suited to innovative research.

  • Keras is a Python-based framework, which makes debugging and exploration easy.

Where is Keras used?

You're already interacting with products built with Keras -- Netflix, Uber, Yelp, Instacart, Zocdoc, Square and many other companies are using it. It is particularly popular among start-ups that place deep learning at the core of their products.

Keras contains many commonly used neural network building blocks, such as layers, objectives, activation functions and optimizers, along with a series of tools that make working with image and text data easier.

In addition, it provides many pre-processed data sets and pre-trained models, such as MNIST, VGG, Inception, SqueezeNet, ResNet, etc.

Keras is also a favorite among deep learning researchers. Researchers at large scientific organizations, CERN and NASA in particular, have adopted Keras.

5. PyTorch

What is PyTorch?

PyTorch is one of the largest machine learning libraries. It allows developers to perform tensor computations with GPU acceleration, create dynamic computational graphs, and calculate gradients automatically. In addition, PyTorch provides rich APIs for solving application problems related to neural networks.
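The idea of calculating gradients automatically on a graph that is built while the code runs can be sketched in plain Python. The `Value` class below is a hypothetical toy written for this illustration and is not part of PyTorch; PyTorch applies the same principle to whole tensors.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the graph topologically, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


x = Value(3.0)
w = Value(2.0)
y = x * w + x * x      # the graph is built as this expression is evaluated
y.backward()
print(y.data, x.grad, w.grad)   # 15.0 8.0 3.0
```

Here dy/dx = w + 2x = 8 and dy/dw = x = 3, which is what the backward pass recovers without any derivative being written by hand.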

This machine learning library is based on Torch, an open source machine learning library implemented in C with a wrapper in Lua.

The Python version of the library, launched in 2017, has grown steadily in popularity since its inception and continues to attract machine learning developers.

Characteristics of PyTorch

  • Hybrid front end

A new hybrid front end provides an easy-to-use and flexible eager mode, along with a seamless transition to graph mode for speed in C++ runtime environments.

  • Distributed training

Native support for asynchronous execution of collective operations and peer-to-peer communication, accessible from both Python and C++, is used to optimize performance in research and production.

  • Python first

PyTorch is not a Python binding into a monolithic C++ framework. It is built to integrate deeply into Python so that it can be used with popular libraries and packages such as Cython and Numba.

  • Libraries and tools

An active community of researchers and developers has built a rich ecosystem of tools and libraries to extend PyTorch and support development in areas ranging from computer vision to reinforcement learning.

Where is PyTorch used?

PyTorch is mainly used for applications such as natural language processing.

It is developed mainly by Facebook's AI research team. Uber's probabilistic programming software "Pyro" is built on it.

PyTorch is considered to compare favorably with TensorFlow in several respects, and it has recently received a lot of attention.

6. LightGBM

What is LightGBM?

Gradient boosting is one of the best and most popular machine learning (ML) techniques. It helps developers build new algorithms by combining redefined elementary models, namely decision trees. Special libraries exist to implement this method quickly and efficiently.

These libraries include LightGBM, XGBoost and CatBoost. They compete with one another, they all help solve common problems, and they can be used in much the same way.
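The principle behind these libraries, building a strong model by adding one small decision tree after another, can be sketched with Numpy alone. The code below is a minimal illustration for squared error with one-split trees ("stumps") on a single feature. It is not how LightGBM is implemented, and the function names, synthetic data and hyperparameters are all assumptions made for the example.

```python
import numpy as np


def fit_stump(x, residual):
    """Find the single threshold split on a 1-D feature that best fits the residual."""
    best = None
    for threshold in np.unique(x)[:-1]:          # keep both sides non-empty
        left = residual[x <= threshold]
        right = residual[x > threshold]
        error = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or error < best[0]:
            best = (error, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Boosting for squared error: each stump is fitted to the current residuals."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps = []
    for _ in range(n_rounds):
        residual = y - prediction                # negative gradient of squared error
        stump = fit_stump(x, residual)
        prediction += learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, prediction


# Synthetic data: a noisy sine curve.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 6, size=200))
y = np.sin(x) + rng.normal(scale=0.1, size=200)

base, stumps, fitted = gradient_boost(x, y)
baseline_mse = np.mean((y - y.mean()) ** 2)
boosted_mse = np.mean((y - fitted) ** 2)
print(baseline_mse, boosted_mse)
```

Production libraries add many refinements on top of this loop, such as deeper trees, regularization and faster split finding, but the additive structure is the same.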

Characteristics of LightGBM

  • The calculation speed is fast and the production efficiency is high.

  • It is intuitive and easy to use.

  • Training is faster than with many deep learning libraries.

  • It does not produce errors when it encounters NaN values and other canonical values.

Where is LightGBM used?

This library provides highly scalable, optimized and fast implementations of gradient boosting, which makes it popular among machine learning developers. Many machine learning full-stack developers have won machine learning competitions using these algorithms.

7. Eli5

What is Eli5?

The predictions of machine learning models are often inaccurate, and Eli5, a machine learning library built in Python, helps to overcome this challenge. It combines visualization and debugging of machine learning models and tracks all the working steps of an algorithm.

Characteristics of Eli5

In addition, Eli5 supports other libraries, including XGBoost, lightning, Scikit-Learn and sklearn-crfsuite. Each of these libraries can perform different tasks.

Where is Eli5 used?

  • Mathematical applications that require a large amount of computation in a short time

  • Situations where there are dependencies on other Python packages, in which Eli5 plays a critical role

  • Traditional applications in various fields that are being implemented with newer methods

8. SciPy

What is SciPy?

SciPy is a machine learning library for application developers and engineers. However, you need to know the difference between the SciPy library and the SciPy stack. The SciPy library contains modules for optimization, linear algebra, integration and statistics.

Characteristics of SciPy

The main feature of the SciPy library is that it is built on Numpy, and its arrays make full use of Numpy.

In addition, SciPy provides efficient numerical routines, such as optimization, numerical integration and many others, through its specific submodules.

All functions in all SciPy submodules are well documented.

Where is SciPy used?

SciPy is a library that uses Numpy to solve mathematical problems. It uses Numpy arrays as its basic data structure and comes with modules for various common tasks in scientific programming.

SciPy easily handles tasks such as linear algebra, integration (calculus), solving ordinary differential equations and signal processing.
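The following sketch shows a few of these tasks using the `linalg`, `integrate` and `optimize` submodules. The specific matrix, integrand, objective function and differential equation are assumptions chosen because their exact answers are easy to check by hand.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Linear algebra: solve the system A x = b (exact solution is x = [2, 3]).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Numerical integration: the integral of sin over [0, pi] is exactly 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Optimization: minimize (t - 2)^2, whose minimum is at t = 2.
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)

# Ordinary differential equation: dy/dt = -y with y(0) = 1, so y(1) = exp(-1).
solution = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0],
                               rtol=1e-8, atol=1e-10)

print(x, area, result.x, solution.y[0, -1])
```

All inputs and outputs are ordinary Numpy arrays or Python floats, which is what lets SciPy routines be chained with other Numpy-based code.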

9. Theano

What is Theano?

Theano is a computational framework machine learning library in Python for computing with multidimensional arrays. It works similarly to TensorFlow, but it is less effective than TensorFlow because it is harder to fit into production environments.

In addition, Theano can also be used in distributed or parallel environments, much like TensorFlow.

Characteristics of Theano

  • Tight integration with Numpy -- the ability to use complete Numpy arrays in Theano-compiled functions

  • Efficient use of GPU -- much faster than CPU for data-intensive computing

  • Efficient symbolic differentiation -- Theano computes derivatives for functions with one or more inputs

  • Speed and stability optimizations -- the correct answer to log(1 + x) can be obtained even when x is very small. This is just one example of Theano's stability.

  • Dynamic C code generation -- expressions are evaluated faster than before, greatly improving efficiency

  • Extensive unit testing and self-verification -- many types of errors and ambiguities in models are detected and diagnosed
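The stability point about logarithms of values close to 1 can be reproduced with Python's standard `math` module. This sketch only demonstrates the underlying floating-point problem and the usual remedy; it does not use Theano, and the value of `x` is an arbitrary small number chosen for the demonstration.

```python
import math

x = 1e-18

# Naive evaluation: 1.0 + x rounds to exactly 1.0 in double precision,
# so the information in x is lost before the logarithm is taken.
naive = math.log(1.0 + x)

# Stable evaluation: log1p computes log(1 + x) without forming 1 + x.
stable = math.log1p(x)

print(naive, stable)
```

For very small x, log(1 + x) is approximately x, so the stable result is the correct one while the naive result collapses to zero.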

Where is Theano used?

The actual syntax of Theano expressions is symbolic, which can be off-putting to beginners accustomed to conventional software development. Specifically, expressions are defined in the abstract, compiled, and only then used for computation.

It is specially designed to handle the computational requirements of the large neural network algorithms used in deep learning. It is one of the earliest libraries of its kind (development began in 2007) and is considered an industry standard for deep learning research and development.

Theano is currently used in a number of neural network projects, and its popularity has grown over time.

10. Pandas

What is Pandas?

Pandas is a machine learning library in Python that provides high-level data structures and a wide variety of analysis tools. An important feature of this library is the ability to express complex data operations in one or two commands. Pandas has many built-in methods for grouping, combining and filtering data, as well as for time series.
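A brief sketch of what "one or two commands" looks like in practice is given below. The `sales` table, its column names and its values are invented for this example.

```python
import pandas as pd

# A hypothetical sales table, made up for this example.
sales = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-05", "2019-01-20", "2019-02-03",
                            "2019-02-17", "2019-02-28"]),
    "region": ["north", "south", "north", "south", "north"],
    "amount": [120.0, 80.0, 200.0, 50.0, 30.0],
})

# Grouping: total amount per region.
per_region = sales.groupby("region")["amount"].sum()

# Filtering: keep only the rows with large orders.
large = sales[sales["amount"] >= 100.0]

# Time series: monthly totals, indexed by the first day of each month.
monthly = sales.set_index("date")["amount"].resample("MS").sum()

print(per_region, large, monthly, sep="\n\n")
```

Each of the three operations is a single expression, and each returns another Pandas object that can be fed straight into the next step of an analysis.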

Characteristics of Pandas

Pandas makes the entire process of manipulating data easier. Support for operations such as re-indexing, iteration, sorting, aggregation, joining and visualization is among its highlights.

Where is Pandas used?

Currently, there are fewer releases of the Pandas library, and they include hundreds of new features, bug fixes, enhancements and API changes. The improvements in Pandas concern its ability to group and sort data, to select the most appropriate output for the method being applied, and to support operations on custom types.

Data analysis accounts for most uses of Pandas. When used with other libraries and tools, however, Pandas also ensures high performance and flexibility.

That concludes this introduction to the top 10 machine learning libraries in Python. I hope this article helps you get started with the libraries available in Python.

Via: https://dzone.com/articles/top-10-python-libraries-you-must-know-in-2019
