Data Engineer
Closing Date: 31 December 2023
(Associate up to 5 years / Consultant 5-9 years / Senior Consultant 10 years & above)
Responsibilities
- Find trends in data sets and develop algorithms that make raw data more useful to the business
- Apply deep knowledge of SQL database design and multiple programming languages
- Perform data extraction, cleaning, transformation, and flow management; web scraping may also be part of the extraction scope (a scraping sketch follows this list)
- Design, build, launch, and maintain efficient, reliable, large-scale batch and real-time data pipelines with data processing frameworks (a pipeline sketch follows this list)
- Support and advise on best practices in data management standards, policies, and procedures
- Work closely with fellow developers through pair programming and the code review process
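
To make the extraction-and-cleaning bullet concrete, here is a minimal Python sketch of a scrape-then-clean flow using requests, BeautifulSoup, and pandas. The URL, table id, and column names are hypothetical placeholders, not part of this role's actual systems.

```python
# Sketch: extract an HTML table from a (hypothetical) page, then clean it.
import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_prices(url: str) -> pd.DataFrame:
    """Extract: pull rows out of a hypothetical 'prices' table."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    rows = []
    for tr in soup.select("table#prices tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == 2:
            rows.append(cells)
    return pd.DataFrame(rows, columns=["item", "price"])

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Clean: drop duplicates and blanks, standardise text, coerce types."""
    df = df.drop_duplicates().dropna()
    df["item"] = df["item"].str.strip().str.lower()
    df["price"] = pd.to_numeric(df["price"].str.replace(",", ""), errors="coerce")
    return df.dropna(subset=["price"])

if __name__ == "__main__":
    raw = scrape_prices("https://example.com/prices")  # placeholder URL
    print(clean(raw).describe())
```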
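And a minimal PySpark sketch of the batch side of such a pipeline: read raw files, clean and aggregate, write curated Parquet. The paths and column names are illustrative assumptions; a real-time variant would read from a stream (e.g. Kafka via readStream) instead of static files.

```python
# Sketch: batch pipeline — extract raw CSV, transform, load partitioned Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-batch-pipeline").getOrCreate()

# Extract: raw landing-zone files (placeholder path).
orders = spark.read.option("header", True).csv("/data/raw/orders/*.csv")

# Transform: type casting, de-duplication, basic filtering.
clean = (
    orders
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_date"))
    .dropDuplicates(["order_id"])
    .filter(F.col("amount") > 0)
)

# Derive a daily aggregate for downstream consumers.
daily = clean.groupBy("order_date").agg(
    F.count("*").alias("orders"),
    F.sum("amount").alias("revenue"),
)

# Load: partitioned Parquet in the curated zone (placeholder path).
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "/data/curated/daily_orders"
)
spark.stop()
```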
Requirements
- Bachelor's or Master's degree in Statistics, Computer Science, Engineering, Analytics, Applied Mathematics, Operations Research, or an equivalent quantitative field preferred
- Strong experience in the design, development, architecture, administration, and implementation of data-intensive applications
- Minimum of 5 years' experience in big data, including architectural experience in choosing and designing the right Big Data solutions and in performance tuning (MapReduce, Hive, Spark, Impala, HDFS, YARN, Oozie)
- In-depth experience in installing, configuring, managing, capacity planning, and administering Hadoop clusters
- Experience developing end-to-end data flows using big data frameworks and tools such as Apache NiFi, Sqoop, Flume, Kafka, Spark, Hive, Python, Cassandra, and RabbitMQ
- Experience developing ETL pipelines and batch jobs, including designing and developing ETL processes using ETL tools
- Working knowledge of data quality and profiling (a profiling sketch follows this list)
- Hands-on with processing large volumes of data, Big Data, and advanced analytics
- Should have knowledge of data profiling and data quality
- Able to perform data analysis and trend analysis; working knowledge of ETL tools, databases, and experience in writing complex queries are required
- Able to resolve data issues related to different customer journeys and stitch data across systems, including on-premises and cloud
- Good to have knowledge of Data Warehouse architectures and Data Modelling – Star Schema, Fact and Dimension design (a star-schema sketch follows this list)
- Exposure to QlikView or any other reporting tool is an added advantage
- Hands-on with Spark / Python / Scala to process data at large scale
- Proficient in database design and various databases (e.g. MongoDB, PostgreSQL/PostGIS, MySQL, SQLite, VoltDB, Cassandra, etc.)
- Working knowledge of data governance, backup, and disaster recovery using tools such as Apache Atlas, Apache Falcon, DPS (Hortonworks), and Cloudera sync
- Strong knowledge of cloud computing on AWS/Azure/Google Cloud
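
For the data-quality and profiling requirements above, a small illustrative sketch in Python/pandas. The sample frame and the rules are assumptions, shown only to indicate the kind of per-column profiles and rule-based checks meant.

```python
# Sketch: per-column profiling plus rule-based data-quality checks.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: dtype, null rate, distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean().round(3),
        "distinct": df.nunique(),
    })

def quality_checks(df: pd.DataFrame) -> dict:
    """Rules of the kind a pipeline would assert on before loading."""
    return {
        "no_duplicate_ids": not df["customer_id"].duplicated().any(),
        "ages_in_range": bool(df["age"].between(0, 120).all()),
        "emails_look_valid": bool(df["email"].str.contains("@", na=False).all()),
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "customer_id": [1, 2, 2, 4],
        "age": [34, 51, 51, 130],
        "email": ["a@x.com", "b@x.com", "b@x.com", None],
    })
    print(profile(sample))
    print(quality_checks(sample))  # duplicate id and out-of-range age fail
```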
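And for the star-schema requirement, a compact sketch of one fact table keyed to two dimensions, plus a typical slice-and-dice query. Table and column names are illustrative, and SQLite stands in for the warehouse engine.

```python
# Sketch: star schema — fact_sales joined to dim_date and dim_product.
import sqlite3

ddl = """
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales  (date_key INTEGER REFERENCES dim_date,
                          product_key INTEGER REFERENCES dim_product,
                          quantity INTEGER, amount REAL);
"""

query = """
SELECT d.month, p.category, SUM(f.amount) AS revenue
FROM fact_sales f
JOIN dim_date d    ON d.date_key = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY d.month, p.category
ORDER BY revenue DESC;
"""

con = sqlite3.connect(":memory:")
con.executescript(ddl)
con.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', '2024-01')")
con.execute("INSERT INTO dim_product VALUES (1, 'widget', 'hardware')")
con.execute("INSERT INTO fact_sales VALUES (20240101, 1, 3, 29.97)")
for row in con.execute(query):
    print(row)  # ('2024-01', 'hardware', 29.97)
```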