Mastered Big Data Pipelines

Data Engineer
Current Employee
Has worked at Capital One for less than 1 year
October 27, 2022
5.0
Recommends
Positive Outlook
Pros

• Built complex terabyte-scale data pipelines and presentation layers with Cascading, consumed by loan performance analysis models.
• Extracted terabytes of data from SQL Server and Teradata into the One Lake.
• Worked with the Spark ecosystem, running Spark SQL and Scala queries against formats such as Parquet and CSV.
• Implemented Spark jobs in Scala and Spark SQL for faster testing and processing of data, and managed data from multiple sources.

Cons

• Big Data Technologies: Hadoop, MapReduce, Hive, Pig, Sqoop, Spark, Kafka.
• Machine Learning: Classification, Regression, Clustering, Feature Engineering, Ensemble Learning.
• Statistics: Bayesian analysis, time series and multivariate data analysis.
• Languages: Python, R, SQL, Scala.
• NoSQL Databases: HBase, Cassandra, MongoDB.
• Cloud Platforms: Azure, AWS.
• DBs & Tools: SQL Server, Azure Data Factory, ETL – SSIS/SSRS/SSAS, Tableau, Power BI.
• Version Control: Git, TFS, Confluence, Bitbucket.

Additional Ratings

Work/Life Balance
5.0
Culture and Values
5.0
Diversity, Equity, and Inclusion
5.0
Career Opportunities
5.0
Compensation and Benefits
5.0
Senior Management
5.0
