Lyft Data Science Assignment
Thank you for taking the time to complete Lyft's Data Science Assignment!
Assignment
Lyft ridesharing is a two-sided marketplace with drivers and passengers. Every day, new drivers join the platform, and existing drivers either drive or do not. Suppose you are working as a Data Scientist on the Driver Retention team, whose primary goal is to reduce the rate of churn of activated drivers (a driver becomes activated once they complete their first ride).

The team would like to understand churn better. Explore the data to provide the team with a deeper understanding of churn at Lyft. Your summary should include:
- The definition (with justification) for a driver to be considered churned.
- An assessment of the current business impact of churn to Lyft.
- Insights on factors affecting churn.
- Insights on segments of drivers more likely to churn.

Next, the team would like to size the opportunity of reducing churn in order to prioritize their roadmap. The team is considering the following two hypotheses:
i. Doubling the number of rides in an activated driver's first week.
ii. Another hypothesis you recommend.

Using the data, help the team prioritize these two hypotheses. You should cover:
- How big the opportunities are.
- What the longer-term consequences of each hypothesis on the marketplace might be.
- Which segments of drivers are most likely affected by each hypothesis.
- Which hypothesis you have more confidence in.

Finally, suppose the team wants to test the following hypothesis: eliminating the Prime Time feature will decrease driver churn. Design an experiment to do so. Your design should include:
- How you will divide observational units into control and treatment, and a description of the treatment and control conditions.
- Potential second-order effects on the experience of drivers and passengers during this experiment.
- The primary and secondary metrics you will track.
- How long you will run the experiment and how you will choose the winning variant.
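As an illustration of the first design item (dividing observational units into control and treatment), here is a minimal sketch of one common approach: deterministic, hash-based assignment. It assumes the driver is the observational unit and that the split is 50/50; neither is specified by the assignment. The function name `assign_variant`, the salt string, and the `treatment_share` parameter are hypothetical names chosen for this example.

```python
import hashlib


def assign_variant(driver_id: str,
                   salt: str = "prime-time-test",   # hypothetical experiment label
                   treatment_share: float = 0.5) -> str:
    """Deterministically map a driver to 'treatment' or 'control'.

    Hashing the salted ID gives a stable, roughly uniform bucket in [0, 1),
    so the same driver always lands in the same arm for a given salt.
    """
    digest = hashlib.sha256(f"{salt}:{driver_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # first 32 bits scaled to [0, 1)
    return "treatment" if bucket < treatment_share else "control"


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    for driver_id in ["driver_a", "driver_b", "driver_c"]:
        print(driver_id, assign_variant(driver_id))
```

Driver-level randomization is only one option. Because drivers and passengers share the same marketplace, a design that randomizes larger units (for example, regions or time windows) may be worth considering when weighing the second-order effects asked about above.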
Submission Instructions
- Please do not write your name on any submission documents.
- Using the data provided, aim to spend roughly 5-8 hours answering the questions.
- Prepare a 20-minute presentation for a panel of Data Scientists. At Lyft, we believe Data Scientists are most effective when they’re telling a story with data. Typically slides are most effective, but you are welcome to use other formats if you prefer (e.g. iPython-markdown, R-markdown, or a Word doc, but you will need to convert them to PDF).
- Include all of your working materials (including all code) in a separate PDF.
- Keep in mind that we will be grading the assignment based on its technical soundness and depth, business applications and insights, structure and organization, completeness and polish.
Data Provided
data/driver_ids.csv
- driver_id: Unique identifier for a driver
- driver_onboard_date: Date on which driver was onboarded
data/ride_ids.csv
- driver_id: Unique identifier for a driver
- ride_id: Unique identifier for a ride that was completed by the driver
- ride_distance: Ride distance in meters
- ride_duration: Ride duration in seconds
- ride_prime_time: PrimeTime applied on the ride
data/ride_timestamps.csv
- ride_id: Unique identifier for a ride that was completed by the driver
- ride_picked_up_at: Timestamp for when driver picked up the passenger
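The files above link through `ride_id` and `driver_id`, so a natural first step is to join rides to their pickup timestamps and summarize activity per driver. The sketch below shows one way to do that with the standard library only. The inline rows are toy data invented for illustration, the timestamp format `%Y-%m-%d %H:%M:%S` is an assumption about the real files, and the helper names (`read_rows`, `last_pickup_per_driver`, `days_inactive`) are hypothetical.

```python
import csv
import io
from datetime import datetime

# Toy rows invented for illustration; the real data lives in data/ride_ids.csv
# and data/ride_timestamps.csv.
RIDE_IDS = """driver_id,ride_id,ride_distance,ride_duration,ride_prime_time
d1,r1,1811,327,50
d1,r2,3362,809,0
d2,r3,5000,900,25
"""
RIDE_TIMESTAMPS = """ride_id,ride_picked_up_at
r1,2016-04-01 08:00:00
r2,2016-06-20 09:30:00
r3,2016-04-10 18:15:00
"""

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def last_pickup_per_driver(ride_rows, timestamp_rows):
    """Join rides to pickup times on ride_id; keep each driver's latest pickup."""
    pickup_by_ride = {
        row["ride_id"]: datetime.strptime(row["ride_picked_up_at"], TS_FORMAT)
        for row in timestamp_rows
    }
    last = {}
    for row in ride_rows:
        picked_up = pickup_by_ride.get(row["ride_id"])
        if picked_up is None:
            continue  # ride has no pickup timestamp; skip it
        driver = row["driver_id"]
        if driver not in last or picked_up > last[driver]:
            last[driver] = picked_up
    return last


def days_inactive(last_pickup, as_of):
    """Whole days between each driver's last pickup and the reference date."""
    return {driver: (as_of - ts).days for driver, ts in last_pickup.items()}


if __name__ == "__main__":
    last = last_pickup_per_driver(read_rows(RIDE_IDS), read_rows(RIDE_TIMESTAMPS))
    print(days_inactive(last, as_of=datetime(2016, 6, 27)))
```

A per-driver inactivity gap like this is one possible input to a churn definition. The threshold that separates "inactive" from "churned" is left open here, since choosing and justifying it is part of the assignment.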