The Data Engineer II will independently support data operations functions, pipelines, and architectures. Drawing on your understanding of data content and information architecture, you will design new data workflows and modify existing ones. You will design ETL, machine learning, and artificial intelligence algorithms and data flows, and monitor and tune our data environment for optimal performance. You will also help demonstrate the value of our data assets by collaborating with analysts, architects, and other stakeholders to develop the business cases, plans, and cost estimates our initiatives require.
Ensure compliance with company policies, procedures, and Federal/State regulations
Navigate Microsoft Office software, computer applications, and software specific to the Information Systems department in order to maximize technology tools and gain efficiency
Design, build and optimize managed data pipelines from operational data stores to points of consumption such as an enterprise data warehouse, data mart, or end user technologies
Use innovative tools, techniques, and architectures to automate the most common, repeatable tasks and maximize data quality and productivity
Collaborate across teams to work with stakeholders to ensure that data requirements are appropriately documented, refined and satisfied by the team
Apply and use modern data preparation, integration and AI-enabled metadata management tools and techniques
Track data consumption patterns to proactively identify new requirements or refinements, and report against preset baselines
Use, understand and gain experience with artificial intelligence (AI), machine learning (ML) and predictive analytics to help drive customer experiences
Perform optimization of data environments using techniques like intelligent sampling and caching
Manage logical and physical data models in all their forms, including conceptual models, relational database designs, message models and others
Implement data architectures that can identify, prioritize and execute the data and analytic initiatives focused on defined enterprise strategies and business outcomes
Follow data and analytics security requirements and solutions, and work with management to manage risks and ensure confidentiality, integrity and availability of enterprise data and analytics assets
Follow defined modeling standards, guidelines, best practices, and approved modeling techniques
Assess impact of schema changes across the managed databases to ensure data quality and that resources and aggregations remain accurate
Participate with data and analytics leaders, and business and IT leadership in developing and following information governance processes and structures
Handle other projects on request
Focus on value creation using the organization's data assets, as well as the external data ecosystem to maximize value derived from data and analytics
Bachelor’s Degree in Computer Science, Statistics, Applied Mathematics, Data Management, Information Science, or a related quantitative subject
Master’s Degree in Computer Science, Statistics, Applied Mathematics, Data Management, Information Science, or a related quantitative subject
Ability to design, manage, and implement simple data flows and information architectures for financial institutions
Knowledge of analytics tools and object-oriented/object function scripting languages such as R, Python, Java, and C++
Ability to design, build, and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata, and workload management. Ability to work with both IT and business stakeholders to integrate analytics and data science output into business processes and workflows
Skill with popular database programming languages, including T-SQL for relational databases, and knowledge of (or certifications in) NoSQL/Hadoop-oriented databases such as MongoDB and Cassandra for non-relational data
Experience working with large, heterogeneous datasets to build and optimize data pipelines, pipeline architectures, and integrated datasets using traditional data integration technologies such as APIs and SQL Server Integration Services (SSIS)
Familiarity with SQL-on-Hadoop tools and technologies, including Hive, Impala, Presto, Hortonworks DataFlow (HDF), Dremio, Informatica, Talend, and others
Ability to work with and optimize existing ETL, data integration, and data preparation flows, and to help move them into production
Ability to work with message queuing technologies such as Kafka, JMS, and Azure Service Bus; stream data integration technologies such as Apache NiFi, Apache Beam, Kafka Streams, and Amazon Kinesis; and stream analytics technologies such as Kafka KSQL, Spark Streaming, Samza, and others
Knowledge of popular data discovery, analytics, and BI software tools such as Microsoft Power BI for semantic-layer-based data discovery
Ability to work with data science teams to refine and optimize data science and machine learning models and algorithms
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information, or any characteristic protected by law.
OpenArc is a technology consulting firm providing industry-leading technical talent placement, software development, and technology strategy services to clients nationwide. Through a unique blending of people and software, OpenArc has a business practice that delivers amazing enterprise, mobile and consumer-facing apps and the best talent for contract, contract-to-hire and direct placements for clients and partners alike.
Staffed with the most trusted recruiting experts, elite software developers, UI/UX designers, and market experts, our team provides clients with the best resources, the right techniques, and world-class support, resulting in powerful, measurable success.