1.What is MS-BI?
* Microsoft Business Intelligence is designed to empower business users through the BI Tools.
* Users can create their own reports and analysis (examination and evaluation of relevant information) and easily share and collaborate with their colleagues.
* Microsoft Business Intelligence is designed with IT manageability and cost effectiveness.
2.Core Elements of SSIS Designer:
3.What is difference between Flat files and Raw files?
* A flat file that stores data in a plain text file.
* Each line of the text file holds one record, with fields separated by delimiters, such as commas or tabs.
* Flat files are data files that contain records with no structured relationships.
Raw Files :
* A raw file is a collection of unprocessed data(Binary format).
* This means the file has not been altered, compressed, or manipulated in any way by the computer.
* Raw files are often used as data files by software programs that load and process the data.
4.Explain Derived Column Transformation with an example
5. Explain Fuzzy Lookup Transformation with an example.
_similarity : How many similar indicates or How much indicates the lookup rows.
_confidence : How much confidence in the system is for identifying the similarities.
6. Difference between Execute TSQL Task and Execute SQL Task
In SSIS there is one tasks Execute TSQL task which is similar to Execute SQL task. Will see what is the difference between two.
Execute TSQL Task:
Pros: Takes less memory, faster perfomance
Cons: Output into variable not supported, Only supports ADO.net connection
Execute SQL Task:
Pros: Support output into variables and multiple types of connection, parameterized query possible.
Cons: Takes more memory, slower performance compared to TSQL task.
7. Explain Precedence Constraints with an example
A task will only execute if the condition that is set by the precedence constraint preceding the task is met. By using these constraints, it will choose different execution paths depending on the success or failure of other tasks.
Success – Workflow will proceed when the preceding container executes successfully. Indicated in control flow by a solid green line.
Failure – Workflow will proceed when the preceding container’s execution results in a failure. Indicated in control flow by a solid red line.
Completion – Workflow will proceed when the preceding container’s execution completes, regardless of success or failure. Indicated in control flow by a solid blue line.
Expression/Constraint with logical AND – workflow will proceed when specified expression and constraints evaluate to true. Indicated in control flow by a solid color line along with a small ‘fx’ icon next to it. Color of line depends on logical constraint chosen (e.g. success = green, completion = blue).
8. Different between Control Flow and Data Flow?
Control flow is for designing the flow of the package. Data flow is for ETL process.Data Flow is the subset of control flow
There will be only one control flow while multiple dataflow can exists.Data flow cannot work without a control flow
All process base tasks are part of control flow while ETL related tasks are the part of Dataflow which is again a subset of control flow.
9. What is Check point?
Checkpoint is the property in SSIS which enables the project to restart from the point of failure. When we set the property to true package create the checkpoint file which stores the information about package execution and use to restart package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time the package runs.
10.Explain Bulk insert task with an example
The Bulk Insert task can transfer data only from a text file into a SQL Server table or view.
If the destination table or view already contains data, the new data is appended to the existing data when the Bulk Insert task runs.
we cannot perform any ETL operations during the data transfer.
Destination table should be already present in sql server.
11.Explain Import Columns and Export Columns Transformation
* The Import Column transformation is used to load binary data (photos, documents, media etc.) from the file system into a Data Flow.
* From the Data Flow it can then be loaded into a table, moved to a different location, modified – whatever you need to do.
* It is the reverse of the Export Column transformation, which pushes binary data out of a Data Flow and into the file system.
* The Export Column transformation reads data in a data flow and inserts the data into a file.
* For example, if the data flow contains product information, such as a picture of each product, you could use the Export Column transformation to save the images to files.
* Specify the data columns and the columns that contain the path of files to which to write the data.
* Specify whether the data-insertion operation appends or truncates existing files.
12.What is the difference between OLTP and OLAP?
1.Database systems designed for Transactions purpose.
2.Data is frequently modified (Volatile data).
3.More number of users.
4.Normalized tables maintained.
5.More no.of Indexes not recommended.
7.Database Systems maintain current data (< one year).
8.Small amount of data.
1.DWH Systems designed for Analysis and Reporting purpose.
2.Data modifications are not frequently. (Non volatile data).
3.Less number of users.(Data Analyst Team)
4. Denormalized tables are maintained.
5.More no.of Indexes recommended.
7.DWH maintain historical data ( 1 – 10 years of data).
8.Huge data (Very large database).
13. What is the size of the Cube in your last Project?
Answer to this question varies from project to project and mainly depends on how BIG is your database and how COMPLEX the database design is. Generally for the database with a TRANSACTION TABLE of 50 cores records, the cube size will be around 100GB. So, better go with 100GB as answer to this question.
14.What is size of the database in your last Project?
The database size will be 600 to 800GB for which the cube will come to 100 GB. So go with 800GB for this question.
15.How frequently you process the cube?
You have to be very careful here.
Frequency of processing cube depends on HOW FREQUENTLY YOU ARE GETTING NEW DATA?
Every day at 10pm IST.
we get new data either Daily or Weekly. So you can say that the processing of the cube will be done either Daily or Weekly.
16.What is the toughest challenge you face in your Project?
There are couple of this where we face difficulty.
* While working on RELATIONSHIPS between Measure Groups and Dimensions.
* Working on Complex calculations
* Performance tuning
* Working on KPI creations
* Working on partitions.
17.How many dimensions in your last cube?
5 tables, around 50 dimensions attributes
18.How many measure groups in your last cube?
Total 10 to 20 .
19.What is the Schema of your last cube?
20.Why not STAR Schema ?
My data base design doesn’t support STAR Schema.
21.What are the different relationships that you are used in your cube?
We have 6 types of Relationship:
e.many to many Relationship
22.How many Resources worked on same Cube in your Project?
Only 2 and one in morning shift and another in Evening shift.
23.How much time it take to Process the Cube?
This is Very very important question.This again depends on the Size of database,Complexity of the database and your server settings. For database with 50 cr transaction records, it generally takes 3 hrs.
24.How many Calculation you done in Your Project?
I answer more than 100 and if you tell the same then you are caught unless you are super good in MDX. Best answer for you is “Worked on 50 calculations”.
Python is a dynamic interrupted language which is used in wide varieties of applications. It is very interactive object oriented and high-level programming language.
Tableau is a Software company that caters interactive data visualization products that provide Business Intelligence services. The company’s Head Quarters is in Seattle, USA.
Micro Strategy is one of the few independent and publicly trading Business Intelligence software provider in the market. The firm is operational in 27 Countries around the globe.
Pega Systems Inc. is a Cambridge, Massachusetts based Software Company. It is known for developing software for Customer Relationship Management (CRM) and Business process Management (BPM).
Workday specialises in providing Human Capital Management, Financial Management and payroll in online domain.It is a major web based ERP software vendor.
Power BI is business analytics service by Microsoft. With Power BI, end users can develop reports and dashboards without depending on IT staff or Database Administrator.