Data Computing and Analysis in Python

Background

Nowadays, Python has become a universal language because it is simple, readable, versatile, and powerful.

In scientific computing and data analysis, Python is also widely used, backed by highly efficient modules.

One widely accepted set of such modules is the SciPy stack, a collection of open source software for scientific computing in Python.

This stack provides the essentials for that purpose. Its ecosystem is shown below.

pydata-ecosystem

Some developers may doubt that Python can deliver good performance, given its relatively slow execution speed.

To address this, the authors of these packages wrote the performance-critical parts in C, in a carefully optimized form.

As a result, the operations these packages provide run much faster than equivalent plain Python code, and in some cases they can match or even beat naively written C.
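To make the speed difference concrete, here is a minimal sketch that computes a sum of squares once with a plain Python loop and once with NumPy. The function names `sum_squares_python` and `sum_squares_numpy` and the problem size are assumptions chosen for illustration. The measured times depend on the machine, so no specific speedup is claimed.

```python
import timeit

import numpy as np

n = 100_000
values = list(range(n))               # plain Python list of ints
array = np.arange(n, dtype=np.int64)  # the same numbers as a NumPy array


def sum_squares_python(xs):
    """Sum of squares with an explicit Python loop."""
    total = 0
    for x in xs:
        total += x * x
    return total


def sum_squares_numpy(arr):
    """Sum of squares as a dot product, executed in compiled code."""
    return int(np.dot(arr, arr))


py_result = sum_squares_python(values)
np_result = sum_squares_numpy(array)

# Time 10 repetitions of each version; absolute numbers vary by machine.
py_time = timeit.timeit(lambda: sum_squares_python(values), number=10)
np_time = timeit.timeit(lambda: sum_squares_numpy(array), number=10)

print(f"same result: {py_result == np_result}")
print(f"plain Python: {py_time:.4f} s, NumPy: {np_time:.4f} s")
```

Both versions return the same value. On a typical setup the NumPy version is expected to be noticeably faster, because the loop runs inside compiled code instead of the interpreter.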

Consequently, this stack of packages has become the standard in these young, fast-growing fields.

Packages

The ecosystem is organized in layers, and the packages in each layer work together with the others.

Below, we pick the most notable package for each layer. Let’s look at them from the user's point of view, starting with the layer closest to the user.

Integrated Development Environment

  • Spyder: A community-developed IDE with excellent support for scientific computing

Interactive Shell

  • IPython: Advanced Python shell with strong development-support features

Visualization

  • Matplotlib: Mature and popular library for plotting 2D and 3D charts and saving them as images
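As a small illustration of plotting a chart and rendering it to an image, the following sketch draws a sine curve and writes it as PNG data into an in-memory buffer. The choice of the `Agg` backend, the buffer, and the variable names are assumptions made so the example runs without a display. Saving to a file path with `fig.savefig("plot.png")` works the same way.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no window needed

import matplotlib.pyplot as plt
import numpy as np

# Sample one period of a sine wave.
x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple 2D line chart")
ax.legend()

# Render the figure as PNG into memory instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)

png_bytes = buffer.getvalue()
print(f"rendered {len(png_bytes)} bytes of PNG data")
```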

Dataframe

  • pandas: Provides high-performance, easy-to-use data structures
  • SciPy (library): Provides scientific algorithms built on NumPy arrays
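The two packages in this layer are often used together. The sketch below, built on made-up data that follows y = 2x + 1, stores the data in a pandas `DataFrame`, summarizes it, and then passes the underlying NumPy arrays to `scipy.stats.linregress` to fit a line. The column names and the data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Illustrative data: y depends linearly on x (y = 2x + 1).
df = pd.DataFrame({"x": np.arange(10, dtype=float)})
df["y"] = 2.0 * df["x"] + 1.0

# pandas: quick descriptive statistics for every column.
summary = df.describe()
print(summary.loc[["mean", "min", "max"]])

# SciPy: least-squares line fit on the NumPy arrays behind the columns.
fit = stats.linregress(df["x"].to_numpy(), df["y"].to_numpy())
print(f"slope={fit.slope:.3f}, intercept={fit.intercept:.3f}")
```

Because the data is exactly linear, the fitted slope and intercept should recover 2 and 1 up to floating-point error.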

Dataclass

  • NumPy: Provides the base array data type that all modules in the stack use for numerical computation
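A short sketch of what this base data type looks like in practice: an `ndarray` has a fixed element type and a shape, supports whole-array arithmetic, and is what other packages hand back when asked for their raw data. The variable names are assumptions, and pandas is used here only as one example of a package built on top of NumPy.

```python
import numpy as np
import pandas as pd

# A 2x3 array of 64-bit floats: [[0, 1, 2], [3, 4, 5]].
a = np.arange(6, dtype=np.float64).reshape(2, 3)
print(a.shape, a.dtype)

# Whole-array operations, no explicit Python loops.
scaled = a * 2               # element-wise multiplication
col_means = a.mean(axis=0)   # mean of each column
print(col_means)

# A pandas Series exposes its data as a NumPy array.
series = pd.Series(a[0])
backing = series.to_numpy()
print(type(backing).__name__)
```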

Interpreter

  • CPython: Implemented in C; the best-known and most widely supported interpreter, and the default implementation
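To check which interpreter a script is running on, the standard library's `platform` module can be queried, as in this minimal sketch. On a default installation from python.org the reported implementation is expected to be `CPython`, but other interpreters report their own names.

```python
import platform
import sys

# Name of the running interpreter implementation, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()

# Version of the Python language this interpreter implements.
version = platform.python_version()

print(f"{implementation} {version}")
print(f"sys.implementation.name = {sys.implementation.name}")
```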
