Industry Experience: 2+ years
Desired Skills:
• Big Data - Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), and Hadoop distributions from Hortonworks or Cloudera; experience with the latest merged frameworks is preferred.
• MapR, Pig, Hive, Python, Sqoop, Spark, Mesos, Luigi, Azkaban, YARN
• Kafka, Oozie, ZooKeeper, Ganglia
• AWS Suite - EC2, EMR, Redshift and similar tools
• ETL Tools - Business Objects Data Integrator r2/r3.2, Data Quality, Data Insight, Data Federator, Universal Data Cleanse (UDC)
• Databases - Oracle 9i/10g, SQL Server 2005/2008 (SSIS, SSAS, SSRS)
• NoSQL databases, including open-source databases, e.g. MongoDB or GraphQL.
• It is not essential to know all of the tools and technologies listed above, but preference will be given to candidates who know most of them or equivalent tools.
Desired Experience:
• Working experience designing, building, and configuring medium- to large-scale Hadoop environments.
• Excellent understanding of Hadoop architecture and the Big Data ecosystem, including components such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and the MapReduce programming paradigm.
• Hands-on experience installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Spark, Kafka, ZooKeeper, and YARN.
• Experience monitoring Hadoop cluster environments using Ganglia.
• Experience working with Big Data, Spark, Scala, and the Hadoop Distributed File System (HDFS).
• Experience working with different Big Data deployments, whether in the cloud (AWS, Azure) or on premises (native, Cloudera, and Hortonworks).
• Strong knowledge of Hadoop and Hive, including Hive's analytical functions.
• Similar experience in the above areas using different toolsets is also acceptable.
Qualifications:
Bachelor's degree (preferably in a science subject), or a Bachelor's degree in Computer Science or Computer Applications (BCA), or A Levels with Computing together with certificate or diploma courses in programming languages, database design, or other subjects relevant to the skills listed above.
Duties:
• Develop on Big Data technologies as per project delivery requirements.
• Develop data ingestion and data structuring for optimised storage and bulk processing of data.
• Import and export data into and out of HDFS and Hive using Sqoop and similar tools and utilities, e.g. Microsoft Azure Data Factory.
• Load and transform large structured, semi-structured, and unstructured data sets.
• Work with Business Analysts, Technical Architects, and Solution Designers to understand the detail of the designed solutions and develop against the requirements using different programming languages and tools.
• Liaise with the Project Manager and PMO team, follow the task timelines, and report progress and delays.
• Highlight to the PMO team any risks and issues affecting development completion timelines.
• Work with fellow team members and developers and support System Integration Testing (SIT).
• Unit test the code and fix defects identified during SIT, UAT, and other testing phases.