Teradata

Teradata

Top Interview Questions

About Teradata

 

Teradata: A Comprehensive Overview

Introduction to Teradata

Teradata is a powerful enterprise-level data warehousing and analytics platform designed to handle large volumes of data efficiently. It is widely used by organizations that require high-performance analytics, complex query processing, and scalability. Teradata is best known for its ability to manage massive parallel processing (MPP) systems, which allow it to process huge datasets quickly and reliably.

Founded in 1979 as a part of NCR Corporation, Teradata became an independent company in 2007. Over the years, it has evolved from a traditional on-premises data warehouse solution to a modern, cloud-enabled analytics platform that supports advanced analytics, machine learning, and hybrid architectures.


What is Teradata?

Teradata is primarily a relational database management system (RDBMS) optimized for data warehousing and analytics. Unlike traditional databases that are designed for transaction processing, Teradata is designed for Online Analytical Processing (OLAP). It allows users to run complex queries on large datasets without significantly affecting system performance.

Teradata supports standard SQL, making it easy for developers, data analysts, and BI professionals to interact with the system. It is commonly used in industries such as banking, telecommunications, retail, healthcare, and airlines, where data volumes are extremely large and business decisions depend on accurate and fast analytics.


Key Features of Teradata

1. Massively Parallel Processing (MPP)

Teradata’s core strength lies in its MPP architecture. Data and queries are distributed across multiple nodes, allowing tasks to be processed in parallel. This significantly improves query performance and scalability.

2. Linear Scalability

Teradata can scale linearly, meaning performance improves as more nodes are added. Organizations can expand their systems as data grows without major redesigns.

3. High Performance and Speed

Teradata efficiently handles complex joins, aggregations, and large scans, making it ideal for analytical workloads involving terabytes or petabytes of data.

4. SQL Support

Teradata uses ANSI-compliant SQL, enabling users to write complex queries easily. Advanced SQL features such as window functions and analytical functions are well supported.

5. Data Distribution and Indexing

Teradata automatically distributes data evenly across the system using Primary Indexes (PI), which reduces data skew and improves performance.

6. Workload Management

Teradata includes powerful workload management tools that prioritize critical queries and ensure fair resource allocation among users.


Teradata Architecture

Teradata architecture is designed to support parallel processing and high availability. The main components include:

1. Nodes

A Teradata system consists of multiple nodes. Each node is an independent processing unit with its own CPU, memory, and storage.

2. Parsing Engine (PE)

The Parsing Engine is responsible for:

  • Receiving SQL requests from users

  • Checking syntax and semantics

  • Creating execution plans

  • Managing sessions and security

3. Access Module Processor (AMP)

AMPs are the workhorses of Teradata. They:

  • Store data on disk

  • Retrieve and manipulate data

  • Perform aggregations and joins

  • Write results back to disk

Each AMP processes data independently, enabling parallel execution.

4. BYNET

BYNET is the high-speed communication layer that connects nodes and enables data transfer between AMPs.


Teradata Indexes

Indexes play a critical role in Teradata performance.

Primary Index (PI)

  • Determines how data is distributed across AMPs

  • Can be unique or non-unique

  • Helps in even data distribution and faster access

Secondary Index (SI)

  • Optional index for faster access when PI is not used in queries

  • Can be unique or non-unique

  • Consumes additional storage

Join Index

  • Pre-joins tables to improve query performance

  • Useful for complex queries involving multiple tables


Teradata SQL and Query Processing

Teradata SQL is similar to standard SQL but includes some extensions. Query processing follows these steps:

  1. User submits an SQL query

  2. Parsing Engine validates and optimizes the query

  3. Query plan is generated

  4. AMPs execute the plan in parallel

  5. Results are returned to the user

Teradata’s optimizer is cost-based and highly sophisticated, choosing the most efficient execution path based on statistics and data distribution.


Data Loading in Teradata

Efficient data loading is essential for data warehouses. Teradata provides several utilities:

FastLoad

  • Used for loading large volumes of data into empty tables

  • Very fast but does not allow duplicate rows

MultiLoad

  • Used for bulk loading, updating, deleting, and upserting data

  • Works on populated tables

TPump (Teradata Parallel Transporter – TPT)

  • Used for near-real-time data loading

  • Suitable for small batches and continuous loads

Teradata Parallel Transporter (TPT)

  • A unified framework that replaces FastLoad, MultiLoad, and TPump

  • Supports both batch and continuous loads


Teradata Deployment Options

1. On-Premises Teradata

Traditional deployment where Teradata runs on dedicated hardware in a company’s data center. Suitable for organizations with strict security or regulatory requirements.

2. Teradata Vantage

Teradata Vantage is the modern analytics platform that unifies data warehousing, data lakes, and advanced analytics.

3. Cloud Teradata

Teradata supports major cloud platforms:

  • AWS

  • Microsoft Azure

  • Google Cloud Platform

Cloud deployment offers flexibility, scalability, and reduced infrastructure costs.

4. Hybrid Architecture

Combines on-premises and cloud environments, allowing organizations to move workloads gradually to the cloud.


Advanced Analytics and AI in Teradata

Teradata supports advanced analytics and AI/ML workloads through:

  • In-database analytics

  • Integration with Python, R, and SAS

  • Machine learning functions

  • Graph and time-series analytics

With Teradata Vantage, users can run analytics where the data resides, reducing data movement and improving performance.


Security in Teradata

Security is a critical component of Teradata systems. Key security features include:

  • User authentication and authorization

  • Role-based access control

  • Data encryption at rest and in transit

  • Auditing and logging

  • Compliance with industry standards

These features ensure data confidentiality, integrity, and availability.


Advantages of Teradata

  • Handles extremely large datasets efficiently

  • Excellent performance for complex analytical queries

  • Highly scalable and reliable

  • Strong workload management

  • Mature and stable platform

  • Widely used in large enterprises


Limitations of Teradata

  • High cost compared to some modern cloud-native databases

  • Requires skilled professionals for administration

  • Not ideal for small datasets or OLTP workloads

  • Migration and licensing can be complex


Teradata Use Cases

  • Enterprise data warehousing

  • Customer behavior analytics

  • Financial risk analysis

  • Fraud detection

  • Supply chain analytics

  • Telecom call detail record (CDR) analysis


Career Opportunities in Teradata

Professionals skilled in Teradata can work as:

  • Teradata Developer

  • Data Warehouse Engineer

  • Data Analyst

  • BI Developer

  • Database Administrator (DBA)

Key skills include SQL, data modeling, performance tuning, and ETL tools.

Fresher Interview Questions

 

1. What is Teradata?

Answer:
Teradata is a massively parallel processing (MPP) relational database management system (RDBMS) designed to handle very large volumes of data. It is mainly used in data warehousing and analytics environments where high performance, scalability, and reliability are required.

Teradata distributes data across multiple nodes and processes queries in parallel, which makes it extremely fast for complex analytical queries on large datasets.


2. What are the key features of Teradata?

Answer:
Key features of Teradata include:

  • Massively Parallel Processing (MPP) architecture

  • High scalability (supports petabytes of data)

  • Shared-nothing architecture

  • Automatic data distribution

  • Advanced query optimization

  • High availability and fault tolerance

  • Support for ANSI SQL


3. What is MPP (Massively Parallel Processing)?

Answer:
MPP is an architecture where multiple processors work independently and in parallel to process data. In Teradata:

  • Each processor handles a portion of the data

  • Queries are divided into smaller tasks

  • Tasks are executed simultaneously

This results in faster query execution and better performance for large datasets.


4. What is a Teradata node?

Answer:
A node is a physical or virtual server in a Teradata system. Each node contains:

  • CPUs

  • Memory

  • Disk storage

  • AMPs (Access Module Processors)

Multiple nodes work together to process queries in parallel.


5. What is an AMP in Teradata?

Answer:
AMP stands for Access Module Processor. It is the fundamental unit of work in Teradata. Each AMP:

  • Stores a portion of the database

  • Performs row-level operations such as sorting, aggregating, and joining

  • Works independently of other AMPs

The number of AMPs directly impacts performance.


6. What is a PE (Parsing Engine)?

Answer:
The Parsing Engine (PE) is responsible for:

  • Receiving SQL queries from users

  • Checking syntax and semantics

  • Optimizing queries

  • Generating execution plans

  • Sending work instructions to AMPs

The PE does not store data.


7. What is BYNET?

Answer:
BYNET is the high-speed communication network in Teradata that connects:

  • Parsing Engines

  • AMPs

  • Nodes

It enables fast data transfer and synchronization between system components.


8. Explain Teradata architecture.

Answer:
Teradata architecture consists of:

  1. Client Layer – Tools like SQL Assistant, BTEQ, or BI tools

  2. Parsing Engine Layer – Parses and optimizes queries

  3. AMP Layer – Stores and processes data

  4. BYNET – Communication layer between components

This architecture ensures high performance and scalability.


9. What is a Primary Index (PI)?

Answer:
A Primary Index determines how rows are distributed across AMPs. It:

  • Is mandatory for every table

  • Can be unique or non-unique

  • Controls data distribution

  • Improves query performance when used in WHERE clauses


10. Difference between Unique PI and Non-Unique PI.

Answer:

Unique PI Non-Unique PI
Each value is unique Duplicate values allowed
Faster access Slightly slower
Ensures even distribution May cause data skew

11. What is a Secondary Index?

Answer:
A Secondary Index (SI) provides an alternative access path to data. It is used when queries do not use the Primary Index.

Types:

  • Unique Secondary Index (USI)

  • Non-Unique Secondary Index (NUSI)


12. What is data skew in Teradata?

Answer:
Data skew occurs when data is unevenly distributed across AMPs. This leads to:

  • Performance issues

  • Longer query execution time

  • Overloaded AMPs

Choosing the right Primary Index helps reduce data skew.


13. What is a hash function in Teradata?

Answer:
Teradata uses a hash function to:

  • Convert Primary Index values into hash values

  • Determine which AMP stores a row

The hash value ensures even data distribution.


14. What is a fallback table?

Answer:
A fallback table stores a duplicate copy of data on a different AMP. It provides:

  • Data protection

  • High availability in case of AMP failure

Fallback requires additional storage space.


15. What is a multiset table?

Answer:
A multiset table allows duplicate rows. Teradata supports:

  • SET tables – Do not allow duplicate rows

  • MULTISET tables – Allow duplicate rows

Multiset tables are commonly used in data warehouses.


16. What is a volatile table?

Answer:
A volatile table:

  • Exists only during a user session

  • Is stored in memory (or spool)

  • Is automatically dropped at session end

  • Is mainly used for intermediate results


17. What is spool space?

Answer:
Spool space is temporary disk space used to:

  • Store intermediate query results

  • Sort and join data during query execution

Each user is allocated a specific amount of spool space.


18. Difference between DELETE and DROP.

Answer:

DELETE DROP
Removes rows Removes entire table
Can use WHERE clause Cannot use WHERE
Table structure remains Table structure removed

19. What is BTEQ?

Answer:
BTEQ (Basic Teradata Query) is a command-line utility used to:

  • Run SQL scripts

  • Automate batch jobs

  • Export and import data


20. What are the advantages of Teradata?

Answer:
Advantages include:

  • Excellent performance for large data

  • Linear scalability

  • High reliability and availability

  • Strong support for analytics

  • Efficient data distribution


21. Difference between Teradata and Oracle.

Answer:

Teradata Oracle
MPP architecture SMP architecture
Best for data warehousing Best for OLTP
Automatic data distribution Manual tuning required

22. What is statistics in Teradata?

Answer:
Statistics provide information about data distribution. The optimizer uses statistics to:

  • Choose efficient execution plans

  • Improve query performance

Statistics are collected using the COLLECT STATISTICS command.


23. What is a join index?

Answer:
A Join Index is a pre-joined table that improves query performance by reducing join processing at runtime.


24. What is the difference between SET and MULTISET tables?

Answer:

SET Table MULTISET Table
No duplicate rows Duplicate rows allowed
Slower inserts Faster inserts
Default table type Preferred in DW

25. Why is Teradata mainly used in data warehousing?

Answer:
Teradata is ideal for data warehousing because it:

  • Handles huge data volumes

  • Executes complex analytical queries efficiently

  • Supports parallel processing

  • Provides high scalability and reliability


26. What is a Teradata database?

Answer:
In Teradata, a database is a logical container that holds:

  • Tables

  • Views

  • Indexes

  • Macros

  • Stored procedures

A database also controls space allocation and user access.


27. What is a Teradata user?

Answer:
A user is a special type of database that:

  • Can log in to Teradata

  • Owns objects (tables, views)

  • Has permissions and spool space

Every user is technically a database, but not every database is a user.


28. Difference between USER and DATABASE in Teradata.

Answer:

USER DATABASE
Can log in Cannot log in
Has password No password
Has spool space No spool space
Used by people/apps Used for storage

29. What is a view in Teradata?

Answer:
A view is a virtual table created using a SELECT query. It:

  • Does not store data physically

  • Stores only the query definition

  • Is used for security and simplicity


30. What is a macro in Teradata?

Answer:
A macro is a stored set of SQL statements that can be executed with a single call.
Benefits:

  • Reduces repeated SQL coding

  • Improves consistency

  • Easy execution


31. Difference between MACRO and STORED PROCEDURE.

Answer:

Macro Stored Procedure
Only SQL statements SQL + control logic
No IF/LOOP Supports IF, LOOP
Faster Slightly slower
Simple tasks Complex logic

32. What is a stored procedure?

Answer:
A stored procedure is a program stored in the database that:

  • Contains SQL and procedural logic

  • Supports conditions and loops

  • Can accept input/output parameters

Used for complex business logic.


33. What is a surrogate key?

Answer:
A surrogate key is an artificial key (usually a number) used instead of a natural key.
Example:

  • Customer_ID generated by system

It helps in:

  • Performance

  • Avoiding business key changes


34. What is a natural key?

Answer:
A natural key is a real-world attribute that uniquely identifies a record.
Example:

  • PAN number

  • Email ID


35. What is normalization?

Answer:
Normalization is the process of:

  • Organizing data

  • Removing redundancy

  • Improving data integrity

In Teradata data warehouses, normalization is often reduced for performance.


36. What is denormalization?

Answer:
Denormalization is the process of:

  • Combining tables

  • Reducing joins

  • Improving query performance

It is commonly used in data warehouses.


37. What is a fact table?

Answer:
A fact table stores measurable data such as:

  • Sales amount

  • Quantity

  • Revenue

It usually contains foreign keys to dimension tables.


38. What is a dimension table?

Answer:
A dimension table contains descriptive information like:

  • Customer

  • Product

  • Time

  • Location

Used for filtering and grouping data.


39. What is a star schema?

Answer:
A star schema has:

  • One fact table in the center

  • Multiple dimension tables around it

It is simple and provides fast query performance.


40. What is a snowflake schema?

Answer:
A snowflake schema is an extension of star schema where:

  • Dimension tables are normalized

  • More joins are required


41. Difference between STAR and SNOWFLAKE schema.

Answer:

Star Schema Snowflake Schema
Simple design Complex design
Fewer joins More joins
Faster queries Slower queries
More storage Less storage

42. What is a FULL TABLE SCAN?

Answer:
A full table scan occurs when:

  • No index is used

  • Teradata scans all rows of the table

It is slower and should be avoided for large tables.


43. What is EXPLAIN in Teradata?

Answer:
The EXPLAIN command shows:

  • How Teradata executes a query

  • Whether indexes are used

  • Data movement between AMPs

Used for performance tuning.


44. What is a NO PRIMARY INDEX (NoPI) table?

Answer:
A NoPI table:

  • Has no primary index

  • Rows are distributed randomly

  • Commonly used for staging data


45. What is a hash join?

Answer:
A hash join:

  • Uses hash values to match rows

  • Is efficient for large tables

  • Is commonly used in Teradata


46. What is a merge join?

Answer:
A merge join:

  • Requires sorted data

  • Compares rows sequentially

  • Used when data is already sorted


47. What is a cartesian join?

Answer:
A cartesian join:

  • Occurs when join condition is missing

  • Produces all possible combinations

  • Very expensive and should be avoided


48. What is partitioned primary index (PPI)?

Answer:
PPI divides data based on a column like:

  • Date

  • Region

It improves performance for range queries.


49. What is a secondary index subtable?

Answer:
A secondary index subtable:

  • Stores index values separately

  • Points to base table rows

  • Requires extra storage


50. What is fastload?

Answer:
FastLoad is used to:

  • Load large volumes of data into empty tables

  • Load data quickly

  • Does not allow duplicate rows


51. What is multiload?

Answer:
MultiLoad is used to:

  • Load, update, delete data

  • Work with large tables

  • Support restart capability


52. Difference between FASTLOAD and MULTILOAD.

Answer:

FastLoad MultiLoad
Insert only Insert, Update, Delete
Empty table Populated table
No duplicates Allows duplicates
Very fast Slightly slower

53. What is TPT (Teradata Parallel Transporter)?

Answer:
TPT is a modern utility that:

  • Replaces FastLoad, MultiLoad, Export

  • Supports parallel data movement

  • Improves performance


54. What is locking in Teradata?

Answer:
Locking ensures:

  • Data consistency

  • Prevents conflicts during access

Types:

  • Read lock

  • Write lock

  • Exclusive lock


55. What is deadlock?

Answer:
Deadlock occurs when:

  • Two transactions wait for each other

  • Neither can proceed

Teradata automatically detects and resolves deadlocks.


56. What is a checkpoint?

Answer:
A checkpoint allows:

  • Restarting a job from failure point

  • Reducing reprocessing time

Common in MultiLoad and FastLoad.


57. What is referential integrity?

Answer:
Referential integrity ensures:

  • Child records reference valid parent records

  • Data consistency across tables


58. What is ANSI mode in Teradata?

Answer:
ANSI mode enforces:

  • Standard SQL behavior

  • Strict rules for NULL comparison

  • Transaction control


59. What is Teradata SQL Assistant?

Answer:
It is a GUI tool used to:

  • Run SQL queries

  • View results

  • Monitor sessions


60. What is the role of Teradata DBA?

Answer:
A Teradata DBA:

  • Manages users and space

  • Monitors performance

  • Handles backups and recovery

  • Ensures system availability

Experienced Interview Questions

 

1. Explain Teradata architecture in detail.

Answer:
Teradata follows a Massively Parallel Processing (MPP) and shared-nothing architecture.

Main components:

  • Client Layer – BTEQ, SQL Assistant, BI tools

  • Parsing Engine (PE) – Parses SQL, checks syntax, creates execution plan

  • BYNET – High-speed network for communication

  • AMPs – Store data and execute queries in parallel

This architecture enables linear scalability and high performance.


2. How does Teradata distribute data across AMPs?

Answer:
Data distribution is based on:

  1. Primary Index value

  2. Hashing algorithm

  3. Hash Map

The PI value is hashed and mapped to a specific AMP, ensuring even data distribution.


3. What is data skew? How do you identify and resolve it?

Answer:
Data skew occurs when data is unevenly distributed across AMPs.

Identification:

  • Check AMP usage in Viewpoint

  • Analyze EXPLAIN plan

  • Query DBC.DISKSPACE

Resolution:

  • Choose a better Primary Index

  • Use composite PI

  • Use NoPI tables for staging

  • Redistribute data


4. Difference between Primary Index and Partitioned Primary Index (PPI).

Answer:

Primary Index PPI
Distributes data across AMPs Divides data within AMP
Improves join performance Improves range queries
Hash-based Range-based

5. What is a Join Index and when do you use it?

Answer:
A Join Index is a physical structure that stores pre-joined data.

Used when:

  • Frequent joins on same columns

  • Complex joins on large tables

  • Query performance is critical


6. Explain different join strategies in Teradata.

Answer:

  • Merge Join – Requires sorted data

  • Hash Join – Uses hash values, very common

  • Nested Join – Used for small tables

  • Product Join – Cartesian join (to be avoided)


7. How do statistics impact query performance?

Answer:
Statistics help the optimizer:

  • Estimate row counts

  • Choose join order

  • Select join method

Without statistics, Teradata may choose inefficient execution plans.


8. How do you collect statistics in Teradata?

Answer:

COLLECT STATISTICS COLUMN(column_name) ON table_name;

Best practice:

  • Collect stats on PI, join columns, filters


9. What is spool space and how do you manage spool issues?

Answer:
Spool space stores intermediate query results.

Management:

  • Optimize queries

  • Use proper WHERE conditions

  • Drop unnecessary volatile tables

  • Increase user spool allocation


10. Difference between SET and MULTISET tables in real-time.

Answer:

  • SET tables avoid duplicates but slow inserts

  • MULTISET tables allow duplicates and faster loads
    ➑️ MULTISET is preferred in data warehouses.


11. What is NoPI table and where is it used?

Answer:
NoPI tables have:

  • No primary index

  • Random row distribution

Used in:

  • Staging tables

  • Temporary processing


12. Explain FASTLOAD, MULTILOAD, and TPT.

Answer:

Utility Use Case
FastLoad Initial bulk load
MultiLoad Update/Delete/Insert
TPT Modern parallel utility

13. How does Teradata handle concurrency?

Answer:

  • Uses locking mechanisms

  • Supports multiple sessions

  • Manages workload via TASM


14. What is TASM?

Answer:
Teradata Active System Management (TASM) manages:

  • Workload prioritization

  • Resource allocation

  • Query throttling


15. What is fallback and when do you use it?

Answer:
Fallback stores duplicate data on different AMPs.

Used when:

  • High availability is required

  • Data loss is unacceptable


16. Explain different locking levels in Teradata.

Answer:

  • ACCESS – Read without blocking

  • READ – Consistent read

  • WRITE – Allows read but blocks write

  • EXCLUSIVE – Blocks all access


17. How do you tune a slow-running query?

Answer:
Steps:

  1. Check EXPLAIN plan

  2. Verify statistics

  3. Check data skew

  4. Optimize joins

  5. Reduce spool usage

  6. Use proper indexes


18. What is a volatile table and how is it used in projects?

Answer:
Volatile tables:

  • Exist for session duration

  • Used for intermediate results

  • Improve performance by reducing I/O


19. Difference between DELETE, TRUNCATE, and DROP.

Answer:

DELETE TRUNCATE DROP
Row-level Removes all rows Removes table
WHERE allowed No WHERE No table left
Slow Fast Permanent

20. What is Query Banding?

Answer:
Query banding attaches metadata to queries, useful for:

  • Monitoring

  • Auditing

  • Workload management


21. How does Teradata optimizer work?

Answer:
The optimizer:

  • Uses cost-based optimization

  • Analyzes statistics

  • Chooses best execution path


22. Explain fallback vs RAID.

Answer:

  • Fallback – Logical data protection

  • RAID – Physical disk protection

Both are used together for high availability.


23. What is a temporal table?

Answer:
Temporal tables store:

  • Historical data

  • Valid time and transaction time

Used for auditing and tracking changes.


24. How do you handle large DELETE operations?

Answer:

  • Use batch deletes

  • Use partitioning

  • Avoid full table locks


25. Explain real-time Teradata project architecture.

Answer:
Typical architecture:

  • Source systems → Staging (NoPI)

  • Transformation layer

  • Core warehouse

  • Reporting layer


26. What is skewed join and how do you fix it?

Answer:
Occurs when one AMP processes most join rows.

Fix:

  • Change join order

  • Redistribute smaller table

  • Use statistics


27. Difference between USI and NUSI.

Answer:

USI NUSI
Unique values Duplicate values
Fast access Slower
Stored as subtable Stored as subtable

28. What is a hash map?

Answer:
A hash map maps hash values to AMPs and ensures even distribution.


29. Explain workload management in Teradata.

Answer:
Handled by:

  • TASM

  • Priority scheduling

  • Query throttling


30. Why is Teradata preferred for large-scale analytics?

Answer:
Because it offers:

  • Linear scalability

  • High performance

  • Parallel processing

  • Robust workload management


31. How do you choose a good Primary Index in a real project?

Answer:
A good Primary Index (PI) should:

  • Be frequently used in WHERE and JOIN clauses

  • Have high cardinality

  • Ensure even data distribution

  • Be stable (values should not change)

   Avoid:

  • Low-cardinality columns (gender, status)

  • Volatile business keys


32. What is a Composite Primary Index and when is it used?

Answer:
A Composite PI consists of multiple columns.

Used when:

  • Single column doesn’t distribute data evenly

  • Queries frequently use multiple columns together

Example:

PRIMARY INDEX (customer_id, order_date)

33. What is AMP-local vs redistributed join?

Answer:

  • AMP-local join – Data already on same AMP (fastest)

  • Redistributed join – Rows moved between AMPs

  • Duplicated join – Smaller table copied to all AMPs

Best practice: Aim for AMP-local joins.


34. What is duplication in joins and when is it used?

Answer:
Duplication copies a small table to all AMPs to avoid redistribution.

Used when:

  • One table is very small

  • Reduces data movement


35. What is spool skew and how do you fix it?

Answer:
Spool skew occurs when:

  • Spool usage is uneven across AMPs

Fix:

  • Change join order

  • Use better PI

  • Collect statistics

  • Break query into steps


36. How do you analyze a long-running query in production?

Answer:
Steps:

  1. Check Viewpoint

  2. Review EXPLAIN plan

  3. Check statistics

  4. Identify skew

  5. Monitor spool usage

  6. Tune joins and filters


37. What is QueryGrid?

Answer:
QueryGrid allows Teradata to:

  • Query external systems (Hadoop, Oracle)

  • Avoid data movement

  • Perform cross-platform analytics


38. Difference between ACCESS lock and READ lock.

Answer:

ACCESS READ
No blocking Blocks writes
Faster Consistent reads
Dirty reads possible No dirty reads

39. What happens when statistics are stale?

Answer:
Stale stats lead to:

  • Poor execution plans

  • Longer run times

  • Increased spool usage

Solution: Recollect statistics regularly.


40. How does Teradata handle deadlocks?

Answer:
Teradata:

  • Detects deadlocks automatically

  • Aborts one transaction

  • Rolls back changes


41. Explain Perm space, Spool space, and Temp space.

Answer:

  • Perm – Permanent table storage

  • Spool – Query processing

  • Temp – Global temporary tables


42. How do you manage space issues in Teradata?

Answer:

  • Drop unused tables

  • Archive old data

  • Monitor disk usage

  • Increase space allocation


43. What is a Global Temporary Table (GTT)?

Answer:
A GTT:

  • Structure persists

  • Data is session-specific

  • Used for temporary processing


44. Difference between Volatile Table and GTT.

Answer:

Volatile Table GTT
Dropped at logout Structure persists
Session-based Multi-session
Faster setup Reusable

45. What is Temporal table use case?

Answer:
Used for:

  • Audit history

  • Slowly Changing Dimensions

  • Time travel queries


46. How do you load data into large fact tables?

Answer:

  • Use TPT

  • Use NoPI staging

  • Batch loads

  • Validate after load


47. What is Checkpoint in MultiLoad?

Answer:
Checkpoint:

  • Saves job progress

  • Allows restart from failure


48. Explain FastExport.

Answer:
FastExport:

  • Extracts large volumes of data

  • Uses parallelism

  • Faster than SELECT


49. What is a Hash Index?

Answer:
A Hash Index:

  • Improves equality lookups

  • Uses hashing

  • Alternative to NUSI


50. Difference between Join Index and Hash Index.

Answer:

Join Index Hash Index
Pre-joined data Single-table lookup
Improves joins Improves filters
Uses more space Lightweight

51. How do you handle duplicate records in Teradata?

Answer:

  • Use SET tables

  • Use ROW_NUMBER() with QUALIFY

  • Remove duplicates via staging


52. What is QUALIFY and why is it important?

Answer:
QUALIFY filters result after window functions.

Example:

SELECT *
FROM sales
QUALIFY ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY date) = 1;

53. What is the difference between WHERE, HAVING, and QUALIFY?

Answer:

WHERE HAVING QUALIFY
Filters rows Filters groups Filters window results

54. What is workload throttling?

Answer:
Limits number of concurrent queries to:

  • Prevent system overload

  • Maintain performance

Handled by TASM.


55. What is a Tactical query?

Answer:
Short-running, high-priority queries such as:

  • Dashboard lookups

  • Single-row queries


56. Explain Production vs Development Teradata environments.

Answer:

  • DEV – Testing and development

  • QA/UAT – Validation

  • PROD – Live business data


57. How do you migrate Teradata code between environments?

Answer:

  • Use scripts

  • Validate permissions

  • Test performance

  • Monitor post-deployment


58. What is CDC in Teradata?

Answer:
Change Data Capture tracks:

  • Inserts

  • Updates

  • Deletes

Used in incremental loads.


59. What is the role of Teradata Viewpoint?

Answer:
Viewpoint is used for:

  • Monitoring sessions

  • Query performance

  • Resource usage

  • System health


60. Explain real-time Teradata interview scenario.

Answer:
Scenario: Query suddenly takes 10x more time
Solution:

  • Check stats freshness

  • Identify skew

  • Review EXPLAIN

  • Apply tuning