Name: Data Scientist Masters Program
Price: 50% INR
Author: IT GURU

Program Syllabus

Tableau Online Training

ITGuru Tableau Online Training helps you to know the simplification of raw data in an easily understandable format, b...

Course Syllabus
- Introduction to Data Visualization and Power of Tableau
  - What is data visualization, Comparison and benefits against reading raw numbers, Real usage examples from various business domains
  - Some quick powerful examples using Tableau without going into the technical details of Tableau, installing Tableau, Tableau interface
  - Connecting to DataSource, Tableau Data Types, data preparation
- Architecture of Tableau
  - Installation of Tableau Desktop, Architecture of Tableau, Interface of Tableau (Layout, Toolbars, Data Pane, Analytics Pane, etc.)
  - How to start with Tableau, Ways to share and exporting the work done in Tableau
- Working with Metadata & Data Blending
  - Connecting to Excel files, PDFs and Cubes, Managing Metadata and Extracts, Data Preparation and dealing with NULL values
  - Data Joins (Inner, Left, Right, Outer) and Union, Cross Database joining, Data Blending, data extraction, refresh extraction, incremental extraction
  - How to build extract
- Creation of sets
  - Marks, Highlighting, Sort and Group, Working with Sets (Creation of sets, Editing sets, IN/OUT, Sets in Hierarchies), constant sets, computed Sets, bins
- Working with Filters
  - Filters (Addition and Removal), Filtering continuous dates, dimensions, measures, Interactive Filters, marks card, hierarchies
  - How to create folders in Tableau, sorting in Tableau, types of sorting, filtering in Tableau, types of filters, filtering order of operations
- Organizing Data and Visual Analytics
  - Formatting Data (Labels, Annotations, Tooltips, Edit axes), Formatting Pane (Menu, Settings, Font, Alignment, Copy-Paste)
  - Trend and Reference Lines, Forecasting, k-means Cluster Analysis in Tableau, visual analytics in Tableau, reference lines and bands, confidence interval
- Working with Calculations & Expressions
  - Calculation Syntax and Functions in Tableau, Types of Calculations (Table, String, Logic, Date, Number, Aggregate)
  - LOD Expressions (concept and syntax), Aggregation and Replication with LOD Expressions
  - Nested LOD Expressions, Level of Detail, Fixed Level of Detail, Lower Level of Detail, Higher Level of Detail, Quick Table Calculations
  - How to create Calculated Fields, predefined Calculations, how to validate
- Working with Parameters
  - Create Parameters, Parameters in Calculations, Using Parameters with Filters, Column Selection Parameters, Chart Selection Parameters
  - How to use Parameters in Filter Session
  - How to use parameters in Calculated Fields, how to use parameters in Reference Line
- Charts and Graphs
  - Dual Axes Graphs, Histogram (Single and Dual Axes), Box Plot, Pareto Chart, Motion Chart, Funnel Chart, Waterfall Chart, Tree Map, Heat Map, Market Basket analysis, Using Show me
  - Types of Charts, Text Table, Heat map, Highlighted Table, Pie Chart, Tree map, Bar chart, Line Chart, Bubble Chart, Bullet chart, Scatter Chart
  - Dual Axis Graphs, Funnel Chart, Pareto Chart, Waterfall Chart, Maps, Hands-on Lab, Assignment
- Dashboards and Stories
  - Build and Format a Dashboard (Size, Views, Objects, Legends and Filters), Best Practices for Creative and Interactive Dashboards using Actions
  - Create Stories (Intro of Story Points, Creating and Updating Story Points, Adding Visuals in Stories, Annotations with Description)
  - Dashboards & Stories, what is a Dashboard, Filter Actions, Highlight Actions, URL Actions, Selecting & Clearing values, Dashboard Examples
  - Best Practices in Creating Dashboards, Tableau Workspace, Tableau Interface, Tableau Joins
  - Types of Joins, Live vs Extract Connection, Tableau Field Types, Saving and Publishing Data Source, File Types
Data Science Course

The Data Science Online Training at IT Guru will provide you the best knowledge on Data Science basics, data analysis...

Course Syllabus
- Module 1: Introduction to Data Science
  - What is Data Science?
  - Why Python for data science?
  - Relevance in industry and need of the hour
  - How leading companies are harnessing the power of Data Science with Python?
  - Different phases of a typical Analytics/Data Science projects and role of python
  - Anaconda vs. Python
- Module 2: Python Essentials (Core)
  - Python Data Types & Data Objects/Structures (Strings, Tuples, Lists, Dictionaries)
  - Functions
  - Exceptions
  - Decorators
  - Classes and Inheritance
  - Multithreading
  - Python with Databases (PostgreSQL, MySQL)
- Module 3: Accessing / Importing and Exporting Data using Python Modules
  - Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  - Database Input (Connecting to database)
  - Viewing Data objects - subsetting, methods
  - Exporting Data to various formats
  - Important python modules: Pandas, beautifulsoup
- Module 4: Data Analysis and Visualization using Python
  - Introduction to exploratory data analysis
  - Descriptive statistics, Frequency Tables and summarization
  - Univariate Analysis (Distribution of data & Graphical Analysis)
  - Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  - Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  - Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)
  - Libraries we focus under module 4
  - Numpy - Numerical library
  - a) ND array
  - b) Subset, slicing
  - c) Indexing
  - d) List vs ND array
  - e) Manipulating arrays
  - f) Mathematical operations and apply functions
  - g) Linear algebra operations
  - SciPy – Scientific Library
  - Pandas - Data Analysis library
  - a) Data loading
  - b) Series and Data frame
  - c) Selecting rows and columns
  - d) Position and label-based indexing
  - e) Slicing and dicing
  - f) Merging and concatenating
  - g) Grouping and summarizing
  - h) Data Processing, cleaning
  - i) Missing Values
  - j) Outliers
  - Matplotlib – Basic 2D Data Visualization library
  - a) Introduction to Matplotlib, basic plotting, figures and subplots
  - Box plot, Histograms, Scatter plots, image loading
  - b) Introduction to Seaborn
  - Histogram, rug plot, hex plot and density plot
  - Joint plot, pair plot, count plot, Heat maps
  - c) Plotting categorical data and aggregation
  - Seaborn – Advanced Data Visualization library
  - Stat – Statistics library
- Module 5: Statistics & Mathematics
  - Types of data
  - Levels of measurement
  - Categorical variables. Visualization techniques for categorical variables
  - Numerical variables. Using a frequency distribution table
  - Histogram charts
  - Cross tables and scatter plots
  - Measures of central tendency
  - The main measures of central tendency: mean, median and mode
  - Measuring skewness
  - Measuring how data is spread out: calculating variance
  - Standard deviation and coefficient of variation
  - Calculating and understanding covariance
  - The correlation coefficient
  - Basic Statistics - Measures of Central Tendencies and Variance
  - Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
  - Inferential Statistics - Sampling - Concept of Hypothesis Testing
  - Statistical Methods - Z/t-tests (One sample, independent, paired), Anova, Correlation and Chi-square
  - Important modules for statistical methods: Numpy, Scipy, Pandas
- Module 6: Machine Learning – Predictive Modelling – Basics
  - Introduction to Machine Learning & Predictive Modeling
  - Types of Business problems - Mapping of Techniques - Regression vs. classification vs. segmentation vs. Forecasting
  - Major Classes of Learning Algorithms - Supervised vs Unsupervised Learning
  - Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  - Overfitting (Bias-Variance Trade off) & Performance Metrics
  - Feature engineering & dimension reduction
  - Concept of optimization & cost function
  - Concept of gradient descent algorithm
  - Concept of Cross validation (Bootstrapping, K-Fold validation, etc.)
  - Model performance metrics (R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
- Module 7: Machine Learning Algorithms & Applications – Implementation in Python
  - Linear & Logistic Regression
  - Segmentation - Cluster Analysis (K-Means)
  - Decision Trees (CART/C5.0)
  - Ensemble Learning (Random Forest, Bagging & boosting)
  - Artificial Neural Networks (ANN)
  - Support Vector Machines (SVM)
  - Other Techniques (KNN, Naïve Bayes, PCA)
  - Introduction to Text Mining using NLTK
  - Introduction to Time Series Forecasting (Decomposition & ARIMA)
  - Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
  - Fine tuning the models using hyperparameters, grid search, pipelines, etc.
- Machine Learning Case Studies
  - Market Basket Analysis
  - Dimensionality reduction on CTG
  - Email filtering – spam or not spam
  - Product recommendations
  - Fraud detection
  - Breast cancer diagnostic detection
  - House price prediction analysis
  - Predicting wine quality
Machine Learning Course

The Machine Learning Online Training at IT Guru will provide you the best knowledge on Machine learning basics, algor...

Course Syllabus
Artificial Intelligence Online Course

The Artificial Intelligence Training at IT Guru will provide you the best knowledge on AI basics, intelligent machine...

Course Syllabus
- INTRODUCTION TO DATA SCIENCE
  - What is Data Science? – Introduction
  - Importance of Data Science
  - Demand for Data Science Professional
  - Life cycle of data science
  - Tools and Technologies used in Data Science
  - Business Intelligence vs Data Science vs Data Engineer
  - Role of a data scientist
- PART A – INTRODUCTION TO STATISTICS
  - Fundamentals of Math and Probability
  - Basic understanding of linear algebra, Matrices, vectors
  - Basics of Calculus
  - Various types and functions of matrices
  - Eigen vectors and Eigenvalues of a Matrix
  - Fundamentals of Probability
  - Types of events in Probability
  - Permutations & Combinations
  - Associative, Commutative and Distributive Laws
  - Descriptive Statistics
  - Describing or summarizing a set of data: measures of central tendency and measures of dispersion
  - The mean, median, mode, Standard deviation, Variance, Range, kurtosis and skewness.
  - Histograms, Bar chart, Box plot
  - Inferential Statistics
  - What is inferential statistics? Different types of sampling techniques
  - Random variable
  - Probability Distribution and Cumulative Probability Distribution
  - Binomial Distribution & Quincunx
  - Normal Distribution & Normal variable
  - Sample Vs Population summary metrics
  - Point estimate and Interval estimate
  - Creating confidence interval for population parameter using Z* score and confidence level percentage
  - Bias & Variance trade-offs
  - Hypothesis Testing
  - Hypothesis Testing Basics
  - Null Hypothesis
  - Alternate Hypothesis
  - p-Value
  - False Positive & False Negative
  - Types of errors: Type 1 Errors, Type 2 Errors; P-value method, Z-score method
  - T-Test, Analysis of variance (ANOVA)
  - Exploratory Data Analysis
  - Introduction to EDA
  - Data Sourcing & Data cleaning
  - Fixing rows, columns
  - Missing values treatment and invalid values
  - Standardize values and filter data
  - Outliers treatment
  - Types of variables
  - Univariate Analysis on Unordered, ordered and quantitative variables
  - Rank-Frequency and Power Law distribution
  - Bivariate Analysis
  - Correlation
  - Various types of Derived metrics
- PART B – UNDERSTANDING AND IMPLEMENTING
  - Introduction to Machine Learning
  - What is Machine Learning?
  - Introduction to Supervised Learning, Unsupervised Learning & Semi-supervised Learning
  - What is Reinforcement Learning?
  - Variable Identification
  - CRISP-DM framework
  - Linear Regression
  - Introduction to Linear Regression and simple linear regression
  - Cost function, R-Square, RMSE and best fit line
  - Closed form and Gradient descent
  - Linear Regression with Multiple Variables
  - Disadvantages of Linear Models, Interpretation of Model Outputs
  - Understanding Multi-collinearity
  - Adjusted R-Square, P-Value and VIF
  - Missing values & Outlier treatment
  - Understanding Heteroscedasticity
  - Signature of overfitting
  - Case Study
  - Application of Linear Regression for CTG data
  - Logistic Regression
  - Introduction to Logistic Regression
  - Binary Logistic Regression
  - Sigmoid function & Log of odds
  - Threshold Value
  - Multinomial Logistic Regression
  - Introducing the notion of classification; cost function for logistic regression
  - Application of logistic regression to multi-class classification.
  - Confusion Matrix, ROC Curve
  - AIC & BIC
  - Advantages and Disadvantages of Logistic Regression
  - Decision Trees & Random Forest
  - Decision Tree – C4.5, CART
  - How to build a decision tree? Understanding the CART model, classification rules
  - Overfitting problem, stopping criteria and pruning
  - Underfitting
  - Gini Index
  - Entropy & Information Gain
  - MDS
  - How to find the final size of trees? Modeling a decision tree.
  - Introduction to Random Forests
  - Ensembles & Bagging technique
  - Out of Bag error
  - Advantages of Random Forest over Decision Trees
  - Support Vector Machines
  - Introduction to SVM
  - Hyperplane & Linear discriminator
  - Maximal Margin Hyperplane & Support vectors
  - Support Vector classifier
  - Slack variable
  - Boundary & Feature transformation
  - Kernel Trick
  - Handling non-linearity in the dataset using various Kernels
  - Case Study
  - Business Case Study with cardiotocographic (CTG) data
  - Unsupervised Learning
  - Feature Selection & Feature Extraction
  - Feature Construction
  - Hierarchical Clustering
  - K-Means algorithm for clustering – groupings of unlabeled data points.
  - Principal Component Analysis(PCA)
  - Association Rules
  - Case Study
  - Market Basket Analysis
  - Dimensionality reduction on CTG
- PART C – PYTHON PROGRAMMING
  - Python Introduction
  - Python background, features
  - Installation and Various Python IDEs
  - Python vs Other languages
  - Basics
  - Operators in Python – Arithmetic, Relational, Logical and Assignment Operators
  - Variables, Types Of Variables
  - Naming conventions
  - String operations
  - Data Structures
  - Lists
  - Tuples
  - Sets
  - Dictionaries
  - Comprehensions
  - Python for Data Science
  - Numerical Python
  - ND array
  - Subset, slicing
  - Indexing
  - List vs ND array
  - Manipulating arrays
  - Mathematical operations and apply functions
  - Linear algebra operations
  - Pandas
  - Data loading
  - Series and Data frame
  - Selecting rows and columns
  - Position and label-based indexing
  - Slicing and dicing
  - Merging and concatenating
  - Grouping and summarizing
  - Lambda functions and pivot tables
  - Data Processing, cleaning
  - Missing Values
  - Outliers
  - Data visualization
  - Introduction to Matplotlib
  - Basic plotting
  - Figures and sub plotting
  - Box plot, Histograms, Scatter plots, image loading
  - Introduction to Seaborn
  - Histogram, rug plot, hex plot and density plot
  - Joint plot, pair plot, count plot, Heat maps
  - Plotting categorical data and aggregation of values
  - Plotting Time-Series data using tsplot
- BIG DATA
  - Understanding Big Data and Hadoop
  - What is Data?
  - Different types of Data
  - What is Big Data and the purpose? Where do we use it?
  - Various Big Data technologies; why Hadoop?
  - Hadoop ecosystem, rack awareness
  - HDFS Architecture
  - Hadoop 1.x vs 2.x
  - HDFS Cluster architecture
  - Resource management and configuration files
  - Slaves and master
  - Data loading techniques
  - Map Reduce
  - MapReduce Paradigm
  - Advantages of MapReduce
  - Architecture and various components of Map Reduce
  - YARN and workflow
  - Data orchestration in Map Reduce job flow
  - Combiners, Partitioners
  - Advanced Map Reduce
  - Joins
  - Various Data Types
  - Input formats
  - Output formats
  - MRUnit testing framework
  - Counters
  - Distributed Cache
  - Sequence File
  - Pig
  - What is Pig and why is it required?
  - Pig vs MapReduce
  - Pig components and structure
  - Data Types
  - Data structures
  - Pig limitations
  - Pig Latin
  - Operators
  - Functions
  - Hive
  - What is Hive and where to use
  - Pig Vs Hive
  - Hive architecture and components
  - Data Types and data models
  - Partitions, buckets
  - Data loading
  - HiveQL
  - UDF
  - Index
  - Views
  - Joins
  - Partitioning
  - HBase
  - Introduction to NoSQL Database
  - HBase storage architecture
  - HBase components
  - Regions
  - Client
  - Modes
  - Ports and utilities
  - Attributes
  - Data Model
  - Data Loading
  - HBase API
  - Zookeeper and HBase
  - Sqoop and Flume
  - Understand Data Ingestion
  - Introduction to Sqoop and Flume
  - Sqoop: Import from RDBMS to HDFS
  - Sqoop: Import from RDBMS to Hive
  - Sqoop Jobs
  - Flume architecture
  - Flume agent
  - Flume sinks
  - Flume channels
  - Executing the commands
  - Flume multi agent
  - Kafka
  - Introduction to Kafka
  - Kafka Producer
  - Kafka Consumer
  - Internals
  - Cluster membership and controller
  - Replication
  - Request processing
  - Data Storage
  - File processing
  - Compaction
  - Broker
  - Cluster architecture
  - Monitoring
  - Oozie
  - Overview
  - Workflow
  - Scheduling in Oozie
  - Configuration files
  - Monitoring and Coordinator
  - Time and Data triggers
  - Oozie console
  - Spark
  - Spark Overview and architecture
  - Spark Shell
  - Spark context
  - RDDs (Resilient Distributed Datasets)
  - RDD operations
  - Partitioning
  - Transformations and actions
  - Key-value pair
  - Persistence
  - Spark streaming
  - Spark DStreams
  - Transformations
  - Request count
  - Spark SQL
  - Structured data processing
  - Spark with JSON & XML
  - Data frame operations
  - Working with CSV files & JDBC
  - Broadcast Variables
  - Accumulators
  - MapReduce vs Spark
  - Scala
  - Overview and background
  - Scala vs other languages
  - Environment setup
  - Scala compiler
  - Immutability
  - Variables and Various operators
  - Conditional statements and Loops
  - Lists, Tuples, Maps and options
  - Comprehensions
  - Functional programming in Scala
  - Object Oriented Programming
Deep Learning Course

The Deep Learning Training at IT Guru will provide you the best knowledge on deep learning fundamentals, neural netwo...

Course Syllabus
TensorFlow Training

The TensorFlow Training at IT Guru will provide you the best knowledge on the variables, Tensors, ML models, etc with...

Course Syllabus

FAQs

What is OnlineITGuru’s Data Scientist Master’s Program, and how is it different from ITGuru’s individual courses?
The Data Scientist Master’s Program is a structured learning path designed by industry experts to help you become a professional Data Scientist. Individual courses focus on only one or two skills or specializations, whereas the Master’s Program combines a range of skills and specializations to help you master Data Science. This path takes you through the world of Data Science.
What are the prerequisites to enroll in the Data Scientist Master’s Program?
There are no prerequisites for enrolling in the Master’s Program. It does not matter whether you are a professional with IT industry experience or a Data Science aspirant; the program is designed to assist people irrespective of professional background.
Why should I enroll in the Master’s Program in Data Science?
The Data Scientist Master’s Program has been designed based on research and suggestions from industry experts. It will help you master Data Science, Machine Learning, Tableau, AI, Deep Learning, and TensorFlow concepts. You will also get hands-on experience with various tools and techniques, and IT Guru will guide you throughout the program.
What are the different modes of OnlineITGuru Master’s Program Training?
ITGuru’s Data Science Master’s Program combines self-paced and instructor-led online training courses. Learners can opt for expert-guided training as well as self-paced learning.
How long does it take to become a professional Data Scientist?
The recommended duration of this program is 32 to 34 weeks. However, learners can complete the program at their own pace.
If I enroll today, when shall I get the Master’s Program course access?
As soon as you enroll in the program, you get access to all the courses within it. Every learner who enrolls in the Master’s Program gets lifetime access to these courses.
What are the projects included in the Data Scientist Master’s Program?
OnlineITGuru offers up-to-date, relevant, real-time projects as part of this training program. They help you apply the skills you learn in the online training. The program includes multiple projects across different domains that test your skills and your theoretical and practical knowledge to make you industry-ready. Upon completion, you will have gained good industry experience.
Is there any specific order to complete the Master’s program courses?
No specific order is imposed. Learners are free to choose and complete the courses in any order they prefer.
Does OnlineITGuru provide job assistance?
Yes, OnlineITGuru provides placement assistance. You will get job interview sessions and resume-building assistance. IT Guru’s placement assistance team has a good global network and will assist you in getting placed upon completion of the program.
Is it possible to switch from a self-paced program to instructor-led training in the Master’s program?
Yes, you can switch from self-paced training to the instructor-led program by paying an additional fee. You will be notified of the batch details when you switch or enroll.
How do I pay the training fee for the Online Data Scientist Master’s Program?
The IT Guru platform lets you pay the training fee for the Master’s Program through a secure payment gateway. Discounts on the program fee are offered from time to time and are announced on the website.

Self-Paced Learning

29040 33000

+91 955 010 2466

Data Scientist Masters Program

6

245

129

12

Program Features

As per your convenience

Never miss a class

Personal Learning Manager

Lifetime Access

Program Fees

Can't find a suitable time?

77400 86000

FAQ's

Reviews

4.9/5

Our Masters Course Alumni work for amazing companies

Get a certificate when you complete a course

Related Masters Programs

Cloud Architect Masters Program

Artificial Intelligence Masters Program

Microsoft Azure Certification Masters Program