Python Data Engineer
Parse.ly is a real-time content measurement layer for the entire web.
Our analytics platform helps digital storytellers at some of the web's best sites, such as Ars Technica, The New Yorker, TechCrunch, Wired, The Intercept, Slate, and many more. In total, our analytics system handles over 65 billion monthly events from over 1 billion monthly unique visitors.
On the open source front, we maintain streamparse, the most widely used Python binding for the Apache Storm streaming data system. We also maintain pykafka, the most performant and Pythonic binding for Apache Kafka.
Our colleagues are talented: our UX/design team has built one of the best-looking dashboards on the planet, using Vue and D3.js, and our infrastructure engineers have built a scalable, devops-friendly cloud environment.
As a Python Data Engineer, you will help us expand our reach into the area of petabyte-scale data analysis — while ensuring consistent uptime, provable reliability, and top-rated performance of our backend streaming data systems.
We’re the kind of team that does “whatever it takes” to get a project done.
Parse.ly’s data engineering team already uses modern technologies like Python, Storm, Spark, Kafka, and Elasticsearch to analyze large datasets. As a Python Data Engineer at Parse.ly, you will be expected to master these technologies, write code against them in Python, and debug issues down to the native C code and native JVM code layers as necessary.
This team owns a real-time analytics infrastructure that processes over 2 million pageviews per minute from over 2,000 high-traffic sites. It operates a fleet of cloud servers that includes thousands of cores of live data processing. We have written publicly about mage, our time series analytics engine, which will give you an idea of the kinds of systems we work on.
What you'll do
For this role, you should already be a proficient Python programmer who wants to work with data at scale.
In the role, you’ll...
- Write Python code following best practices. See The Elements of Python Style, written by our CPO, for an example of our approach to code readability and design.
- Analyze data at massive scale. You need to be comfortable with the idea of your code running across 3,000 Python cores, thanks to process-level parallelization.
- Brainstorm new product ideas and directions with team and customers. You need to be a good communicator, especially in written form.
- Master cloud technologies and systems. You should love UNIX and be able to reason about distributed systems.
Our distributed team is best-in-class, and we happily skip commutes by working out of our ergonomic home offices.
- Work from home or anywhere else in our industry-leading distributed team.
- Earn a competitive salary and benefits (health/dental/401k).
- Splurge with a generous equipment budget.
- Work with one of the brightest teams in tech.
- Speak at and attend conferences like PyData on Parse.ly's dime.
- Python for both backend and frontend.
- Amazon Web Services used for most systems.
- Modern databases like Elasticsearch, Redis, Cassandra, and Postgres.
- Frameworks like Django, Tornado and the PyData stack (e.g. Pandas).
- Running Kafka, Storm, Spark in production atop massive data sets.
- Easy system management with Fabric and Terraform.
Fully distributed team
Parse.ly is a fully distributed team, with engineers working from across the world. Candidates with past experience working remotely, and those in US/Eastern time zones, will be prioritized.
A Note About Diversity & Inclusion at Parse.ly
We’re improving diversity in the tech industry. At Parse.ly and our parent company, Automattic, we want people to love their work and show respect and empathy to all. We welcome differences and strive to increase participation from traditionally underrepresented groups.
Studies have shown that women and people of color tend not to apply for jobs unless they feel they meet every item in the job description. We believe that high-performing teams include people from different backgrounds and experiences who can challenge each other’s assumptions with fresh perspectives. We encourage you to apply, even if you don’t match everything listed in the job description. Let us know why this role appeals to you, why you will be great, and which requirements will stretch you.
We are an equal opportunity employer who values and encourages diversity and belonging at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
- Apply now by sending your CV and/or LinkedIn profile, along with a GitHub link (or similar), to email@example.com. Make sure to indicate you are applying for the "Python Data Engineer" role.
- Include a 1-3 paragraph intro to why you're interested in this role.
- Tell us about an interesting project you worked on and point us toward a piece of code you wrote.