[Python] #1 Python for Data Science: Intro
These are my own notes from the LinkedIn Learning course Python for Data Science Essential Training Part 1 by Lillian Pierson.
1. Introduction to the Data Professions
- Data science: systematic study of the structure and behavior of data
  - past + current data ==> predict the future
  - interested in why over how
  - uncover correlations and causations in data to support decision-making
- Data engineering: design, construction, and maintenance of data systems
  - prefer to design and build IT systems (rather than to analyze data)
  - interested in how over why
  - design systems to collect, handle, and store big datasets
  - build modular, scalable platforms for data processing
- Data analytics: creating data products that describe data and how it behaves, using data analysis and visualization
  - use applications to analyze data (without coding)
  - deploy analytics technologies (analytics software applications)
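The data science bullets above (predicting the future from past and current data, uncovering correlations) can be made concrete with a small sketch. It uses only the Python standard library; the `months` and `sales` values and the helper names `fit_line` and `pearson_r` are made up for illustration and do not come from the course.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical past + current data: month index and units sold
months = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]

# Predict the future: extrapolate the fitted line to month 6
slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept

# Uncover a correlation: how strongly do sales move with time?
r = pearson_r(months, sales)

print(forecast, r)
```

The toy data is perfectly linear, so the correlation comes out as 1 and the forecast sits exactly on the line. Real data would be noisier, and a correlation on its own says nothing about causation.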
2. Data Science
- Data Analysis: process for making sense of data
  - includes cleaning, reformatting, and recombining data
  - uncovers the real-life phenomena reflected in the data
- Data Science:
- Artificial Intelligence: a machine or application with the capacity to autonomously act on the predictions it makes from data
  - Prediction: predictive modeling (from data science)
  - Execution: autonomous response (from engineering)
- Deep learning: a set of predictive methods whose structure is borrowed from the neural networks of the brain
  - effective for making predictions from big data
  - can serve as the decision model inside an application to produce deep learning AI
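As a rough illustration of the two ideas above, the sketch below chains a prediction step and an execution step, with the prediction coming from a tiny network of artificial neurons. Everything in it is an assumption made for the example: the weights are hand-picked rather than trained, and the names `neuron`, `predict`, `act`, and the `"send_alert"` response are hypothetical.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def predict(features):
    """Prediction step: a two-layer network with hand-picked (untrained) weights."""
    h1 = neuron(features, [0.8, -0.4], 0.1)   # hidden neuron 1
    h2 = neuron(features, [-0.3, 0.9], 0.0)   # hidden neuron 2
    return neuron([h1, h2], [1.5, -1.2], -0.2)  # output neuron


def act(probability, threshold=0.5):
    """Execution step: pick a response from the prediction, with no human in the loop."""
    return "send_alert" if probability >= threshold else "do_nothing"


score = predict([1.0, 0.5])
print(score, act(score))
```

In practice the weights would be learned from data with a library such as TensorFlow or Keras. The point here is only the shape of the pipeline: layered neurons produce a prediction, and the application acts on it.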
+) Main Python Libraries for Data Science
- Advanced Data Analysis: NumPy, SciPy, pandas
- Data Visualization: Matplotlib, Seaborn
- Machine Learning: scikit-learn, TensorFlow, Keras
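To give a feel for the first group of libraries, here is a minimal sketch of the data analysis steps from section 2 (cleaning, reformatting, recombining) using pandas, with NumPy for a final calculation. The `orders` and `customers` tables are invented for illustration, and the visualization and machine learning libraries are not covered here.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records: a missing amount and inconsistent city text
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": ["10.5", "20.0", None, "7.25"],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "city": [" paris", "Berlin ", "PARIS"],
})

# Cleaning: drop rows with a missing amount, normalize the city text
orders_clean = orders.dropna(subset=["amount"]).copy()
customers["city"] = customers["city"].str.strip().str.title()

# Reformatting: convert amounts from text to numbers
orders_clean["amount"] = pd.to_numeric(orders_clean["amount"])

# Recombining: join the two tables, then total the amounts per city
merged = orders_clean.merge(customers, on="customer_id", how="left")
totals = merged.groupby("city")["amount"].sum()

# NumPy: work on the underlying array of numbers
amounts = merged["amount"].to_numpy()
mean_amount = np.mean(amounts)

print(totals)
print(mean_amount)
```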