Flexion Inc, a Madison, WI-based IT consulting and services firm, is looking for a Cloud Data Engineer to work in a contract-to-hire role. The client for this position is based in Boston, MA, but work can be done remotely. Preference is for candidates in the Eastern or Central time zones.
The Cloud Data Engineer is a specialized role participating in designing and implementing systems on Public Cloud infrastructure to deliver more analytical and business value from a wide range of data sources. You will work with the team to design and develop high-performance, resilient, automated data pipelines, streams, and applications, adapting technologies for ingesting, transforming, classifying, cleansing and exposing data using creative design to meet objectives. Your skills and education in data management technologies will enable you to match the right technologies to the required schemas and workloads. Our focus is on the AWS and GCP platforms, with a strong serverless bias. We rely heavily on Python, PySpark, BigQuery and related technologies, and work in an Agile, DevOps team culture. We expect you to bring an array of specialized skills noted below, and to come prepared to learn rapidly to build on the foundation of your basic skills and education in this field.
Required Experience and Skills:
- Build and maintain serverless data pipelines at terabyte scale using AWS and GCP services: AWS Glue, PySpark and Python, AWS Redshift, AWS S3, AWS Lambda and Step Functions, AWS Athena, AWS DynamoDB, GCP BigQuery, GCP Cloud Composer, GCP Cloud Functions, Google Cloud Storage and others
- Integrate new data from enterprise sources and external vendors using a variety of ingestion patterns, including streams, SQL ingestion, files, and APIs
- Maintain and provide support for the existing data pipelines using the above-noted technologies
- Work to develop and enhance the data architecture of the new environment, recommending optimal schemas, storage layers, and database engines (relational, graph, columnar, and document-based) according to requirements
- Develop real-time/near real-time data ingestion from a range of data integration sources, including business systems, external vendors, partners, and enterprise sources
- Provision and use machine-learning-based data wrangling tools like Trifacta to cleanse and reshape third-party data to make it suitable for use
- Participate in a DevOps culture by developing deployment code for applications and pipeline services
- Develop and implement data quality rules and logic across integrated data sources.
- Serve as internal subject matter expert and coach to train team members in the use of distributed computing frameworks and big-data services and tools, including AWS and GCP services and projects
(Experience is expected to come from hands-on work and formal education)
- Bachelor's degree in Computer Science, Mathematics, Engineering, or equivalent work experience
- Some exposure to working with datasets with a very high volume of records or objects
- 2-3 years of programming experience in Python and SQL
- One year working with Spark or other distributed computing frameworks (may include: Hadoop, Cloudera)
- Two years with relational databases (typical examples include: PostgreSQL, Microsoft SQL Server, MySQL, Oracle)
- Some exposure to AWS services including S3, Lambda, and one or more AWS database technologies such as Redshift, DynamoDB, or Athena
- Some exposure to AWS services: DynamoDB, Step Functions
- Experience with contemporary data file formats such as Apache Parquet and Avro, preferably with compression codecs such as Snappy and bzip2
- Experience analyzing data for data quality and supporting the use of data in an enterprise setting.
Equal Employment Opportunity/Affirmative Action Employer
Job Location: 6000 American Parkway, Boston, MA
If you require a reasonable accommodation to complete any part of the application process, or if you have limited ability or are unable to access or use this online application process and need an alternative method for applying, you may contact us at 608-205-8868 for assistance.
This job has expired.