Hadoop/Big Data Developer | $121k | Brentwood, TN at Vaco

Date Posted: 5/21/2018

Job Description

A successful candidate will have:

  • Bachelor's degree in Computer Science with at least 7 years of IT work experience
  • Strong understanding of best practices and standards for Hadoop application design and implementation
  • 2 years of hands-on experience with Cloudera's Distribution including Apache Hadoop (CDH) and experience with many of the following components:
    • Hadoop, MapReduce, Spark, Impala, Hive, Solr, YARN
    • HBase or Cassandra
    • Kafka, Flume, Storm, ZooKeeper
    • Java, Python, or Scala
    • SQL, JSON, XML
    • RegEx
    • Sqoop
  • Experience building machine learning applications and broad knowledge of machine learning APIs, tools, and open-source libraries
  • Experience with unstructured data
  • Data modeling experience using big data technologies
  • Experience developing MapReduce programs with Apache Hadoop for working with big data (a batch-plus-streaming sketch follows this list)
  • Experience deploying big data technologies to production
  • Understanding of Lambda architectures and real-time streaming
  • Ability to multitask and to balance competing priorities
  • Strong practical experience in agile application development, file-system management, and DevOps discipline, using short-cycle iterations to deliver continuous business value
  • Expertise in planning, implementing, supporting, and tuning Hadoop ecosystem environments using a variety of tools and techniques
  • Knowledge of all facets of Hadoop ecosystem development, including ideation, design, implementation, tuning, and operational support
  • Ability to define and apply best-practice techniques and to impose order in a fast-changing environment; strong problem-solving skills
  • Strong verbal, written, and interpersonal skills, including a desire to work within a highly matrixed, team-oriented environment
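
For illustration only (this sketch is not part of the posting; the paths, topic name, and broker address are placeholder assumptions), the batch and streaming experience above might look like the following minimal PySpark job, with a MapReduce-style word count as the batch layer and a Kafka read as the speed layer of a Lambda-style design:

    # Minimal Lambda-style sketch: batch layer (word count) plus speed layer (Kafka).
    # All paths, topics, and hosts are placeholders. The streaming read requires
    # the spark-sql-kafka connector package on the classpath.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("lambda-sketch").getOrCreate()

    # Batch layer: classic MapReduce-style word count over files in HDFS.
    lines = spark.sparkContext.textFile("hdfs:///data/input/*.txt")
    counts = (lines.flatMap(lambda line: line.split())  # map: split lines into words
                   .map(lambda word: (word, 1))         # map: emit (word, 1) pairs
                   .reduceByKey(lambda a, b: a + b))    # reduce: sum counts per word
    counts.saveAsTextFile("hdfs:///data/output/wordcount")

    # Speed layer: consume the same events from Kafka as they arrive.
    events = (spark.readStream
                   .format("kafka")
                   .option("kafka.bootstrap.servers", "broker:9092")
                   .option("subscribe", "events")
                   .load())

    query = (events.selectExpr("CAST(value AS STRING) AS line")
                   .writeStream
                   .format("console")     # stand-in for a real serving store
                   .outputMode("append")
                   .start())
    query.awaitTermination()

In a production Lambda design, both layers would typically write to a serving store such as HBase rather than to text files and the console.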

A successful candidate may have:

  • Experience in the healthcare domain and its technologies, such as HL7 and HIE
  • Experience with patient data
  • Experience with predictive models
  • Experience with Natural Language Processing (NLP)
  • Experience with social media data

Hardware/Operating Systems:

  • Linux
  • UNIX
  • Distributed, highly-scalable processing environments
  • Networking - basic understanding of networking as it applies to distributed server and file-system connectivity, and the ability to troubleshoot connectivity errors (a minimal probe sketch follows this list)
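
As a hedged illustration of the connectivity troubleshooting mentioned above (the host and port below are placeholder assumptions), a minimal Python reachability probe might look like:

    # Hypothetical connectivity probe: verify that a cluster service port is reachable.
    import socket

    def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
        """Return True if a TCP connection to host:port succeeds within timeout."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    # Placeholder target; e.g., an HDFS NameNode commonly listens for RPC on 8020.
    print(port_open("namenode.example.com", 8020))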

Databases/Tools:

  • RDBMS - Teradata (see the JDBC ingestion sketch after this list)
  • NoSQL - HBase, Cassandra, MongoDB, in-memory, columnar, and other emerging technologies
  • Other Languages - Java, Python, Scala, R
  • Build Systems - Maven, Ant
  • Source Control Systems - Git, Mercurial
  • Continuous Integration Systems - Jenkins or Bamboo
  • Config/Orchestration - ZooKeeper, Puppet, Salt, Ansible, Chef, Oozie, Pig
  • Ability to integrate tools outside of the core Hadoop ecosystem
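
As a hedged example of pulling relational data into the cluster (the ingestion pattern that Sqoop automates; the URL, table, and credentials below are placeholder assumptions), a Spark JDBC read from Teradata might look like:

    # Hypothetical JDBC ingestion sketch; requires the Teradata JDBC driver on the classpath.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-ingest-sketch").getOrCreate()

    claims = (spark.read.format("jdbc")
                   .option("url", "jdbc:teradata://td-host/DATABASE=analytics")  # placeholder URL
                   .option("dbtable", "claims")       # placeholder table
                   .option("user", "etl_user")        # placeholder credentials
                   .option("password", "***")
                   .load())

    # Land the table in HDFS as Parquet for downstream Hive/Impala queries.
    claims.write.mode("overwrite").parquet("hdfs:///data/landing/claims")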

Certifications (a plus, but not required):

  • CCDH (Cloudera Certified Developer for Apache Hadoop)
