Remote Software Engineer (Research)
|Summary:||As a software engineer on our Research team, you will work with researchers, engineers, designers, and volunteers to write code that brings research ideas and outputs to life.|
|Location:||San Francisco, CA|
|May telecommute:||Yes (May work remotely)|
|Description:||Every month, more than 70,000 volunteer editors come together to make free knowledge a reality.|
Every hour, 36,000 people view Wikipedia articles from all over the world. Every second, more than 100,000 HTTP requests hit Wikimedia’s servers.
In the Wikimedia Foundation’s Research team, we leverage this large-scale data in collaboration with Wikimedia volunteer communities to design new technologies and produce empirical insights in the service of the organization and the Wikimedia Movement on its path to knowledge equity and knowledge as a service.
You will write code that helps researchers gather data, run computationally intensive jobs, and eventually make research outputs available through public APIs, data-sets, and applications.
We are looking for a software engineer to join our team who is strongly committed to the principles of open source, transparency, privacy, and collaboration; a strong communicator (both orally and in writing); someone who is self-motivated and proactive and can navigate ambiguity smoothly; and one who wants to learn and is eager to be part of a multi-disciplinary and diverse team at the service of free knowledge.
If you see yourself in the above, please read on and apply!
As a software engineer with our team, you will:
- Collaborate with researchers and engineers to design and expose models, algorithms and machine learning systems through APIs, data-sets, and web applications.
- Design and implement data collection and annotation efforts in collaboration with researchers and volunteer community members.
- Design and optimize computationally intensive data processing jobs.
- Design, develop, test, and deploy new features, improvements and upgrades to the software that supports research.
- Develop prototypes of new applications that incorporate research findings and ideas.
- Act as the Research team’s advocate in the Wikimedia engineering ecosystem and collaborate with teams such as Services, Analytics, Site Reliability Engineering, Security, and Machine Learning Infrastructure, as well as Product, to productionize research outputs.
- Discuss, document and share the process and results of your work publicly; engage with our communities at technical events, conferences and hackathons.
- Find creative solutions and write code that reflects the high standards of privacy in Wikimedia.
- Actively engage in a collaborative, consensus-oriented environment as part of a globally distributed organization.
Skills and Experience:
- BS, MS, or PhD in Computer Science, Mathematics, Statistics, or a closely related engineering field; or equivalent work experience
- Experience with database technologies: MySQL/Postgres or similar
- Experience developing RESTful APIs for data retrieval
- Strong understanding of Computer Science fundamentals, such as algorithms, data structures and complexity
- Knowledge of data analysis and the basics of statistics
- Experience with Hadoop and related technologies: HDFS, YARN, MapReduce, Hive, Spark, etc.
- Experience with distributed computing on modern platforms such as Apache Spark
- Familiarity with NoSQL databases such as Cassandra or MongoDB
- Strong communication skills, including the ability to communicate complex technical issues to a cross-team and cross-functional audience
Additionally, we’d love it if you have:
- A portfolio of open source programming projects
- Relevant work experience with/in applied research teams
- Experience with open source machine learning libraries such as scikit-learn and deep learning frameworks such as Keras, TensorFlow or PyTorch
- Experience as a “data wrangler”, cleaning up and formatting semi-structured or unstructured data
- Experience in label collection using crowdsourcing platforms or large-scale systems
- Production-level experience with Hadoop, Spark, Flink, Hive, Kafka, etc.
- Familiarity with scientific computing libraries in Python
- Experience working with volunteers
- Experience editing Wikipedia or other Wikimedia or open data / knowledge projects
Show us your stuff!
Please provide us with information you feel would be useful to us in gaining a better understanding of your technical background and accomplishments. Links to GitHub, your technical blogs, publications, presentations, personal projects, etc. are exceptionally useful. We especially appreciate pointers to your best contributions to open source projects.
We do value writing and we do read your cover letter. Please introduce yourself to us, tell us why you are applying for this position, and what you’re looking for through this opportunity.