
These are my own notes from the LinkedIn Learning course Python for Data Science Essential Training Part 1 by Lillian Pierson.


1. Introduction to the Data Professions

  • Data science: systematic study of structure and behavior of data

    • past + current ==> predict future
    • interested in why over how
    • uncover correlations and causations in data to support decision-making
  • Data engineering: design, construction, and maintenance of data systems

    • prefer to design and build IT systems (rather than to analyze data)
    • interested in how over why
    • design systems to collect, handle, and store big datasets
    • build modular, scalable platforms for data processing
  • Data analytics: building data products that describe data and how it behaves, through data analysis and visualization

    • use applications to analyze data (without coding)
    • deploy analytics technologies (analytics software applications)

2. Data Science

  • Data Analysis: process for making sense of data
    • includes data cleaning, reformatting, and recombining data
    • discovers real-life phenomena in data
  • Data Science:
  • Artificial Intelligence: a machine or application with the capacity to autonomously execute upon predictions it makes from data
    1. Prediction: predictive modeling (from data science)
    2. Execution: autonomous response (from engineering)
  • Deep learning: a set of predictive methodologies that borrows its structure from the neural networks of the brain
    • effective for making predictions from big data
    • can be used as a decision model with applications to produce deep learning AI
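
The two-part view of AI above (prediction, then execution) can be sketched in plain Python. This is only an illustration, not something taken from the course: the sales numbers, the function names `fit_line`, `predict_next`, and `decide`, and the restock threshold are all made up, and a simple least-squares line stands in for a real predictive model.

```python
# Hypothetical example data: five days of past sales (made-up numbers).
past_sales = [10.0, 12.0, 14.0, 16.0, 18.0]


def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(ys):
    """Prediction step: extrapolate the fitted line one step into the future."""
    slope, intercept = fit_line(ys)
    return slope * len(ys) + intercept


def decide(predicted, restock_threshold=19.0):
    """Execution step: choose an action from the prediction automatically."""
    return "restock" if predicted >= restock_threshold else "hold"


forecast = predict_next(past_sales)  # past + current ==> predict future
action = decide(forecast)            # autonomous response to the prediction
print(forecast, action)
```

The split mirrors the notes: `predict_next` is the data science part, `decide` is the engineering part that acts on the prediction without a human in the loop.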

+) Main Python Libraries for Data Science

  • Advanced Data Analysis: NumPy, SciPy, pandas
  • Data Visualization: Matplotlib, Seaborn
  • Machine Learning: scikit-learn, TensorFlow, Keras
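
As a small taste of the data analysis libraries, here is a minimal sketch of the cleaning, reformatting, and recombining steps listed under Data Analysis. It assumes pandas and NumPy are installed; the tables, column names, and values are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: messy customer names and one missing amount.
orders = pd.DataFrame(
    {
        "customer": [" alice", "BOB ", "alice", "carol"],
        "amount": [10.0, np.nan, 30.0, 20.0],
    }
)
# Hypothetical lookup table mapping each customer to a region.
regions = pd.DataFrame(
    {"customer": ["alice", "bob", "carol"], "region": ["east", "west", "west"]}
)

# Cleaning: normalize the text column and drop rows with a missing amount.
orders["customer"] = orders["customer"].str.strip().str.lower()
clean = orders.dropna(subset=["amount"])

# Recombining: join the cleaned orders with the region lookup table.
merged = clean.merge(regions, on="customer", how="left")

# Reformatting: reshape to one total per region.
totals = merged.groupby("region")["amount"].sum()
print(totals)
```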