There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
the atmospheric conditions that comprise the state of the atmosphere in terms of temperature and wind and clouds and precipitation
If we want a more sophisticated system, we can build another model for traffic congestion and yet another model to forecast weather conditions and their effect on the safest maximum speed.
a unit of information equal to 1000 terabytes or 10^15 bytes
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
Note here the different levels: models of individual components, tied together in a simulation given a set of inputs, iterated through over different input sets in a search optimizer.”
computer architecture in which processors are connected in a manner suggestive of connections between neurons; can learn by trial and error
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
the aggregation of vehicles coming and going in a particular locality
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
a precise rule specifying how to solve some problem
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
the benefits lost by choosing one option over another
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
a person who is cultured and has worldly experience
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
bring to a desired consistency by heating and cooling
Optimization is a process we are all familiar with in our daily lives, even if we have never used algorithms like gradient descent or simulated annealing.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
praise of a person or thing as worthy or desirable
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
Zafu’s approach is not to send their customers directly to the clothes, but to begin by asking a series of simple questions about the customers’ body type, how well their other jeans fit, and their fashion preferences.
a short, boxed section of text accompanying the main text
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
the act of imitating the behavior of some situation
Because the simulation is at a per-policy level, the insurer can view the impact of a given set of price changes on revenue, market share, and other metrics over time.
a representation of something, often on a smaller scale
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
German mathematician who developed the theory of numbers and who applied mathematics to electricity and magnetism and astronomy and geodesy (1777-1855)
Sidebar: Optimization in the real world
Optimization is a classic problem that has been studied by Newton and Gauss all the way up to mathematicians and engineers in the present day.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer.
one of the individual parts making up a larger entity
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
a concession made by a labor union to a company that is trying to lower its expenditures
The takeaway, whether you are a tiny startup or a giant insurance company, is that we unconsciously use optimization whenever we decide how to get to where we want to go.
Engineers are often quietly on the leading edge of algorithmic applications because they have long been thinking about their own modeling challenges in an objective-based way.
a relatively low hill on the lower slope of a mountain
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
Optimization is a process we are all familiar with in our daily lives, even if we have never used algorithms like gradient descent or simulated annealing.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
the older and more populated and (usually) poorer central section of a city
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
a screen-oriented interactive program enabling a user to lay out financial data on the screen
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
We can keep the “like” model that we have already built as well as the causality model for purchases with and without recommendations, and then take a staged approach to adding additional models that we think will improve the marketing effectiveness.
a self-propelled wheeled vehicle that does not run on rails
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns.
maintenance of standards of quality of manufactured goods
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
reproduced or made to resemble; imitative in character
Optimization is a process we are all familiar with in our daily lives, even if we have never used algorithms like gradient descent or simulated annealing.
When designing a product or manufacturing process, a drivetrain-like process followed by model integration, simulation and optimization is a familiar part of the toolkit of systems engineers .
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
the study of poetic meter and the art of versification
Because the simulation is at a per-policy level, the insurer can view the impact of a given set of price changes on revenue, market share, and other metrics over time.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
One of the authors of this paper was explaining an iterative optimization technique, and the host says, “So, in a sense Jeremy, your approach was like that of doing a startup, which is just get something out there and iterate and iterate and iterate.”
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
Optimization is a process we are all familiar with in our daily lives, even if we have never used algorithms like gradient descent or simulated annealing.
advertisement that offers something free to arouse interest
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
an enclosure within which something originates or develops
A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns.
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
a region in California to the south of San Francisco that is noted for its concentration of high-technology industries
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
made for or directed or adjusted to a particular individual
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
the tendency of a body to return to its original shape
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models.
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
If we want a more sophisticated system, we can build another model for traffic congestion and yet another model to forecast weather conditions and their effect on the safest maximum speed.
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
a visual representation of the relations between quantities
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
a practical method or art applied to some particular task
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
When designing a product or manufacturing process, a drivetrain-like process followed by model integration, simulation and optimization is a familiar part of the toolkit of systems engineers .
a widely used search engine that uses text-matching techniques to find web pages that are important and relevant to a user's search
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
the commercial processes in promoting and selling something
]
ODG identified which
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
models we will need, such as physics models to predict the effects of steering, braking and acceleration, and pattern recognition algorithms to interpret data from the road signs.
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
a surface forming a common boundary between two things
These disaster applications are a particularly good example of why data products need simple, well-designed interfaces that produce concrete recommendations.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Data scientists now have the predictive tools to build products that increase the common good, but they need to be aware that building the models is not enough if they do not also produce optimized, implementable outcomes.
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
not of natural origin; prepared or made artificially
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
the state capital and largest city located in south central Arizona; situated in a former desert that has become a prosperous agricultural area thanks to irrigation
The screenshot below is taken from a model integration tool designed by Phoenix Integration .
the spatial property of the way in which something is placed
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
utility consisting of the pipes and fixtures for the distribution of water or gas in a building and for the disposal of sewage
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
a navigational system involving satellites and computers that can determine the latitude and longitude of a receiver on Earth by computing the time difference for signals from different satellites to reach the receiver
Instead of the femme-bot voice of the GPS unit telling us which route to take and where to turn, what would it take to build a car that would make those decisions by itself?
These firms have plenty of experience building models of each of the components and systems in their final product, whether they’re building a server farm or a fighter jet.
Simulator to test the utility of each of the many possible books we have in stock, or perhaps just over all the outputs of a collaborative filtering model of similar customer purchases, and then build a simple
Optimizer that ranks and displays the recommended books based on their simulated utility.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
The Strand bookseller made a brilliant but far-fetched recommendation probably based more on the character of Morrison’s writing than superficial similarities between Morrison and other authors.
a measure of how likely it is that some event will occur
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
any nonverbal action or gesture that encodes a message
The self-driving car needs to take the next step: after
simulating all the possibilities, it must
optimize the results of the simulation to pick the best combination of acceleration and braking, steering and signaling, to get us safely to Santa Clara.
a digital audio file made available on the internet
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
For example, a pair of jeans that is often paired with a particular top, or the first part of a series of novels that often leads to a sale of the whole set.
Because the simulation is at a per-policy level, the insurer can view the impact of a given set of price changes on revenue, market share, and other metrics over time.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
ODG approached this problem with an early use of the Drivetrain Approach and a practical take on step 4 that can be applied to a wide range of problems.
These models predicted whether customers would renew their policies in one year, allowing for changes in price and willingness to jump to a competitor.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns.
The Strand bookseller made a brilliant but far-fetched recommendation probably based more on the character of Morrison’s writing than superficial similarities between Morrison and other authors.
mechanics concerned with forces that cause motions of bodies
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
The difference between these two probabilities is a utility function for a given recommendation to a customer (see Recommendation Engine figure, below).
We could construct a patience model for the customers’ tolerance for poorly targeted communications: When do they tune them out and filter our messages straight to spam?
someone who employs or takes advantage of something
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
What we would really like to do is emulate the experience of Mark Johnson, CEO of Zite , who gave a perfect example of what a customer’s recommendation experience should be like in a recent TOC talk .
examine so as to determine accuracy, quality, or condition
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
These disaster applications are a particularly good example of why data products need simple, well-designed interfaces that produce concrete recommendations.
the process of becoming cooler; a falling temperature
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
a distinct part that can be specified separately in a group
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
the act of scouting, especially to gain information
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
connected to a computer network or accessible by computer
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
a hollow globule of gas (e.g., air or carbon dioxide)
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
The difference between these two probabilities is a utility function for a given recommendation to a customer (see Recommendation Engine figure, below).
What is most important about these examples is that the engineers who designed these data products didn’t start by building a neato robot and then looking for something to do with it.
a commercial business that provides scheduled flights
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
an object consisting of a number of pages bound together
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
United States financier and philanthropist (1855-1937)
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
an item of information that is typical of a class or group
We begin by applying the Drivetrain Approach to a familiar example, recommendation engines, and then building this up into an entire optimized marketing strategy.
having or showing knowledge and skill and aptitude
As scientists and engineers become more adept at applying prediction and optimization to everyday problems, they are expanding the art of the possible, optimizing everything from our personal health to the houses and cities we live in.
The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective.
close-fitting trousers worn for manual work or casual wear
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with books he or she would not have purchased without the recommendation .
Engineers are often quietly on the leading edge of algorithmic applications because they have long been thinking about their own modeling challenges in an objective-based way.
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
vehicles or pedestrians traveling in a particular locality
If we want a more sophisticated system, we can build another model for traffic congestion and yet another model to forecast weather conditions and their effect on the safest maximum speed.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
As one engineer on the Google self-driving car project put it in a recent Wired article , “We’re analyzing and predicting the world 20 times a second.”
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
We can keep the “like” model that we have already built as well as the causality model for purchases with and without recommendations, and then take a staged approach to adding additional models that we think will improve the marketing effectiveness.
material consisting of seed coverings and pieces of stem
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
United States lawyer and poet who wrote a poem after witnessing the British attack on Baltimore during the War of 1812; the poem was later set to music and entitled `The Star-Spangled Banner' (1779-1843)
There is a
Modeler for aerodynamics and mechanical structure that can then be fed to a
Simulator to produce the Key Wing Outputs of cost, weight, lift coefficient and induced drag.
As one engineer on the Google self-driving car project put it in a recent Wired article , “We’re analyzing and predicting the world 20 times a second.”
having a bearing on or connection with the subject at issue
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
Improving the data collection and predictive models is very important, but we want to emphasize the importance of beginning by defining a clear objective with levers that produce actionable outcomes.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective.
a particular course of action intended to achieve a result
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
a branch of mathematics concerned with quantitative data
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
We can keep the “like” model that we have already built as well as the causality model for purchases with and without recommendations, and then take a staged approach to adding additional models that we think will improve the marketing effectiveness.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
It will be low in cases where the algorithm recommends a familiar book that the customer has already rejected (both components are small) or a book that he or she would have bought even without the recommendation (both components are large and cancel each other out).
Although it’s from a completely different engineering discipline, this diagram is very similar to the Drivetrain Approach we’ve recommended for data products.
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
willingness to respect the beliefs or practices of others
We could construct a patience model for the customers’ tolerance for poorly targeted communications: When do they tune them out and filter our messages straight to spam?
income (at invoice values) received for goods and services over some given period of time
The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with books he or she would not have purchased without the recommendation .
Zafu’s approach is not to send their customers directly to the clothes, but to begin by asking a series of simple questions about the customers’ body type, how well their other jeans fit, and their fashion preferences.
United States industrialist and philanthropist who endowed education and public libraries and research trusts (1835-1919)
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
characterized by action or forcefulness of personality
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
give knowledge acquired by learning and instruction
We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models.
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective.
a partiality preventing objective consideration of an issue
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
the act of conducting a controlled test or investigation
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
a mechanism that can move independently of external control
Instead of the femme-bot voice of the GPS unit telling us which route to take and where to turn, what would it take to build a car that would make those decisions by itself?
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models.
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
a set of pages on the internet organized as a single unit
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
We introduced the Drivetrain Approach to provide a framework for designing the next generation of great data products and described how it relies at its heart on optimization.
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
a succession of notes forming a distinctive sequence
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
the science of matter and energy and their interactions
We need to define the
models we will need, such as physics models to predict the effects of steering, braking and acceleration, and pattern recognition algorithms to interpret data from the road signs.
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
a group of many things in the air or on the ground
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
connect, fasten, or put together two or more pieces
The third step was to consider what new
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
models we will need, such as physics models to predict the effects of steering, braking and acceleration, and pattern recognition algorithms to interpret data from the road signs.
As scientists and engineers become more adept at applying prediction and optimization to everyday problems, they are expanding the art of the possible, optimizing everything from our personal health to the houses and cities we live in.
These firms have plenty of experience building models of each of the components and systems in their final product, whether they’re building a server farm or a fighter jet.
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
in a poor or improper or unsatisfactory manner; not well
We could construct a patience model for the customers’ tolerance for poorly targeted communications: When do they tune them out and filter our messages straight to spam?
the discipline that studies transmitting information
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
someone who creates plans to be used in making something
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
ODG approached this problem with an early use of the Drivetrain Approach and a practical take on step 4 that can be applied to a wide range of problems.
These models predicted whether customers would renew their policies in one year, allowing for changes in price and willingness to jump to a competitor.
the state of affairs that a plan is intended to achieve
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
sturdy and strong in form, constitution, or construction
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models.
having few parts; not complex or complicated or involved
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
done by or characteristic of individuals acting together
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
This is not to say that Amazon’s recommendation engine could not have made the same connection; the problem is that this helpful recommendation will be buried far down in the recommendation feed, beneath books that have more obvious similarities to “Beloved.”
The Strand bookseller made a brilliant but far-fetched recommendation probably based more on the character of Morrison’s writing than superficial similarities between Morrison and other authors.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
the corporate executive responsible for the operations of the firm; reports to a board of directors; may appoint other managers (including a president)
What we would really like to do is emulate the experience of Mark Johnson, CEO of Zite , who gave a perfect example of what a customer’s recommendation experience should be like in a recent TOC talk .
They can also explore how the distribution of profit is shaped by the inputs outside of the insurer’s control: “What if the economy crashes and the customer loses his job?
a branch of study or knowledge involving the observation, investigation, and discovery of general laws or truths that can be tested systematically
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
the content of observation or participation in an event
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
These models predicted whether customers would renew their policies in one year, allowing for changes in price and willingness to jump to a competitor.
the act of bringing things together to form a new whole
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
discover or determine the existence, presence, or fact of
Next, we consider what
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
ODG approached this problem with an early use of the Drivetrain Approach and a practical take on step 4 that can be applied to a wide range of problems.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
It will be low in cases where the algorithm recommends a familiar book that the customer has already rejected (both components are small) or a book that he or she would have bought even without the recommendation (both components are large and cancel each other out).
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
something intended to communicate a particular impression
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
a hard black form of lignite that takes a brilliant polish
These firms have plenty of experience building models of each of the components and systems in their final product, whether they’re building a server farm or a fighter jet.
a school teaching mechanical and industrial arts and the applied sciences
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
As scientists and engineers become more adept at applying prediction and optimization to everyday problems, they are expanding the art of the possible, optimizing everything from our personal health to the houses and cities we live in.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
Engineers are often quietly on the leading edge of algorithmic applications because they have long been thinking about their own modeling challenges in an objective-based way.
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
something that stands in the way and must be surmounted
Next, we consider what
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
Instead of the femme-bot voice of the GPS unit telling us which route to take and where to turn, what would it take to build a car that would make those decisions by itself?
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
As one engineer on the Google self-driving car project put it in a recent Wired article , “We’re analyzing and predicting the world 20 times a second.”
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
pecuniary reimbursement to the winning party for the expenses of litigation
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
come into being or existence, or appear on the scene
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
They can also explore how the distribution of profit is shaped by the inputs outside of the insurer’s control: “What if the economy crashes and the customer loses his job?
the speech act of continuing a conversational exchange
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
similar things placed in order or one after another
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
easily perceived by the senses or grasped by the mind
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
reach, make, or come to a conclusion about something
The takeaway, whether you are a tiny startup or a giant insurance company, is that we unconsciously use optimization whenever we decide how to get to where we want to go.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
The takeaway, whether you are a tiny startup or a giant insurance company, is that we unconsciously use optimization whenever we decide how to get to where we want to go.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
We introduced the Drivetrain Approach to provide a framework for designing the next generation of great data products and described how it relies at its heart on optimization.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
having a substance added to increase effectiveness
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
the act of causing something to go (especially messages)
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
The takeaway, whether you are a tiny startup or a giant insurance company, is that we unconsciously use optimization whenever we decide how to get to where we want to go.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
These disaster applications are a particularly good example of why data products need simple, well-designed interfaces that produce concrete recommendations.
a process of becoming larger or longer or more numerous
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
of the immediate past or just previous to the present time
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
atmospheric conditions such as temperature and precipitation
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
come into the possession of something concrete or abstract
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
the boundary line or area immediately inside the boundary
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
What matters is that using a Drivetrain Approach combined with a Model Assembly Line bridges the gap between predictive models and actionable outcomes.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
restrained or managed or kept within certain bounds
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
act of extending over a wider scope or expanse of space or time
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
discover or determine the existence, presence, or fact of
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
ODG approached this problem with an early use of the Drivetrain Approach and a practical take on step 4 that can be applied to a wide range of problems.
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
the context that influences the performance of a process
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
a device in which something can be caught and penned
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
We introduced the Drivetrain Approach to provide a framework for designing the next generation of great data products and described how it relies at its heart on optimization.
standardized procedure for measuring sensitivity or aptitude
We can build a
Simulator to test the utility of each of the many possible books we have in stock, or perhaps just over all the outputs of a collaborative filtering model of similar customer purchases, and then build a simple
Optimizer that ranks and displays the recommended books based on their simulated utility.
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
sequence of events involved in the development of a species
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
We could construct a patience model for the customers’ tolerance for poorly targeted communications: When do they tune them out and filter our messages straight to spam?
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
They can also explore how the distribution of profit is shaped by the inputs outside of the insurer’s control: “What if the economy crashes and the customer loses his job?
Optimization is a process we are all familiar with in our daily lives, even if we have never used algorithms like gradient descent or simulated annealing.
The Strand bookseller made a brilliant but far-fetched recommendation probably based more on the character of Morrison’s writing than superficial similarities between Morrison and other authors.
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
Instead of the femme-bot voice of the GPS unit telling us which route to take and where to turn, what would it take to build a car that would make those decisions by itself?
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
Note here the different levels: models of individual components, tied together in a simulation given a set of inputs, iterated through over different input sets in a search optimizer.”
hanging cloth used as a blind (especially for a window)
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with books he or she would not have purchased without the recommendation .
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
One of the authors of this paper was explaining an iterative optimization technique, and the host says, “So, in a sense Jeremy, your approach was like that of doing a startup, which is just get something out there and iterate and iterate and iterate.”
Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer.
an essential and distinguishing attribute of something
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
machine that creates mechanical energy and imparts movement
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
having or showing knowledge or understanding or realization
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
the principal activity in one's life to earn money
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
abstract separation of something into its various parts
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
We don’t know what design approaches will be developed in the future, but right now, there is a need for the data science community to coalesce around a shared vocabulary and product design process that can be used to educate others on how to derive value from their predictive models.
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
a mode of being or form of existence of a person or thing
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
try to locate or discover, or try to establish the existence of
What is most important about these examples is that the engineers who designed these data products didn’t start by building a neato robot and then looking for something to do with it.
It will be low in cases where the algorithm recommends a familiar book that the customer has already rejected (both components are small) or a book that he or she would have bought even without the recommendation (both components are large and cancel each other out).
Note here the different levels: models of individual components, tied together in a simulation given a set of inputs, iterated through over different input sets in a search optimizer.”
transfer possession of something concrete or abstract
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
a group of people living in a particular local area
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
The first component of ODG’s Modeler was a model of price elasticity (the probability that a customer will accept a given price) for new policies and for renewals.
Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
harmonious arrangement or relation of parts within a whole
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
the act of someone who picks up or takes something
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
a port in western California near the Golden Gate that is one of the major industrial and transportation centers; it has one of the world's finest harbors; site of the Golden Gate Bridge
Suppose we wanted to get from San Francisco to the Strata 2012 Conference in Santa Clara .
performing an essential function in the living body
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
a relative position or degree of value in a graded group
Because the simulation is at a per-policy level, the insurer can view the impact of a given set of price changes on revenue, market share, and other metrics over time.
an open fabric woven together at regular intervals
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
come into the possession of something concrete or abstract
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
What is most important about these examples is that the engineers who designed these data products didn’t start by building a neato robot and then looking for something to do with it.
There are many techniques to avoid this problem, some based on statistics and spreading our bets widely, and others based on systems seen in nature, like biological evolution or the cooling of atoms in glass.
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
a system of rules of conduct or method of practice
Although it’s from a completely different engineering discipline, this diagram is very similar to the Drivetrain Approach we’ve recommended for data products.
ligament made of metal and used to fasten things or make cages or fences etc
As one engineer on the Google self-driving car project put it in a recent Wired article , “We’re analyzing and predicting the world 20 times a second.”
the line at which the sky and Earth appear to meet
They began by defining the
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
English industrialist who pioneered in the design and manufacture of aircraft (1885-1962)
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
deficient in quantity or number compared with the demand
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
Zafu’s approach is not to send their customers directly to the clothes, but to begin by asking a series of simple questions about the customers’ body type, how well their other jeans fit, and their fashion preferences.
the entire amount of income before any deductions are made
Because the simulation is at a per-policy level, the insurer can view the impact of a given set of price changes on revenue, market share, and other metrics over time.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
Plenty of websites sell designer denim, but for many women, high-end jeans are the one item of clothing they never buy online because it’s hard to find the right pair without trying them on.
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
These models predicted whether customers would renew their policies in one year, allowing for changes in price and willingness to jump to a competitor.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
the act of contending with others for rewards or resources
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns.
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
One of the authors of this paper was explaining an iterative optimization technique, and the host says, “So, in a sense Jeremy, your approach was like that of doing a startup, which is just get something out there and iterate and iterate and iterate.”
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
a large number of things or people considered together
Models developed to simulate fluid dynamics and turbulence have been applied to improving traffic and pedestrian flows by using the placement of exits and crowd control barriers as levers.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
For motor vehicle traffic, IBM performed a project with the city of Stockholm to optimize traffic flows that reduced congestion by nearly a quarter, and increased the air quality in the inner city by 25%.
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
a visible clue that something has happened or is present
Next, we consider what
data the car needs to collect; it needs sensors that gather data about the road as well as cameras that can detect road signs, red or green lights, and unexpected obstacles (including pedestrians).
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
What we would really like to do is emulate the experience of Mark Johnson, CEO of Zite , who gave a perfect example of what a customer’s recommendation experience should be like in a recent TOC talk .
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
find out, learn, or determine with certainty, usually by making an inquiry or other effort
Once we have these models, we construct a
Simulator and an
Optimizer and run them over the combined models to find out what recommendations will achieve our objectives: driving sales and improving the customer experience.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
Industrial engineers were among the first to begin using neural networks, applying them to problems like the optimal design of assembly lines and quality control.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
(computer science) a system of world-wide electronic communication in which a computer user can compose a message at one terminal that can be regenerated at the recipient's terminal when the recipient logs in
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
an open fabric of string or rope or wire woven together
They began by defining the
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
As scientists and engineers become more adept at applying prediction and optimization to everyday problems, they are expanding the art of the possible, optimizing everything from our personal health to the houses and cities we live in.
The self-driving car needs to take the next step: after
simulating all the possibilities, it must
optimize the results of the simulation to pick the best combination of acceleration and braking, steering and signaling, to get us safely to Santa Clara.
the cardinal number that is the product of 10 and 100
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
For example, a pair of jeans that is often paired with a particular top, or the first part of a series of novels that often leads to a sale of the whole set.
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
I think of this as a complicated machine (full-system) where the curtain is withdrawn and you get to model each significant part of the machine under controlled experiments and then simulate the interactions.
one of the portions into which something is regarded as divided and which together constitute a whole
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
We could construct a patience model for the customers’ tolerance for poorly targeted communications: When do they tune them out and filter our messages straight to spam?
a statement made to reply to a question or criticism
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
The takeaway, whether you are a tiny startup or a giant insurance company, is that we unconsciously use optimization whenever we decide how to get to where we want to go.
the condition of being susceptible to harm or injury
The danger in this hill-climbing approach is that if the steps are too small, we may get stuck at one of the many local maxima in the foothills, which will not tell us the best set of controllable inputs.
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
Then, Google came along and transformed online search by beginning with a simple question: What is the user’s main objective in typing in a search query?
Note here the different levels: models of individual components, tied together in a simulation given a set of inputs, iterated through over different input sets in a search optimizer.”
They can also explore how the distribution of profit is shaped by the inputs outside of the insurer’s control: “What if the economy crashes and the customer loses his job?
data they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
group of genetically related organisms in a line of descent
We introduced the Drivetrain Approach to provide a framework for designing the next generation of great data products and described how it relies at its heart on optimization.
an arrangement of objects or people side by side in a line
A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns.
Step 4 of the Drivetrain Approach for Google is now part of tech history: Larry Page and Sergey Brin invented the graph traversal algorithm PageRank and built an engine on top of it that revolutionized search.
in a good or satisfactory manner or to a high standard
There are many different optimization techniques to choose from (see see sidebar, below ), but it is a well-understood field with robust and accessible solutions.
Instead of the femme-bot voice of the GPS unit telling us which route to take and where to turn, what would it take to build a car that would make those decisions by itself?
applying the mind to learning and understanding a subject
But those models did not solve the pricing problem, so the insurance companies would set a price based on a combination of guesswork and market studies.
a visible mass of water or ice particles suspended at a considerable altitude
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
up to the immediate present; most recent or most up-to-date
Here is a screenshot of the “Customers Who Bought This Item Also Bought” feed on Amazon from a search for the latest book in Terry Pratchett’s “ Discworld series :”
All of the recommendations are for other books in the same series, but it’s a good assumption that a customer who searched for “Terry Pratchett” is already aware of these books.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
a category of things distinguished by a common quality
As predictive modeling and optimization become more vital to a wide variety of activities, look out for the engineers to disrupt industries that wouldn’t immediately appear to be in the data business.
It is easy to stumble into the trap of thinking that since data exists somewhere abstract, on a spreadsheet or in the cloud, that data products are just abstract algorithms.
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
objective that the insurance company was trying to achieve: setting a price that maximizes the net-present value of the profit from a new customer over a multi-year time horizon, subject to certain constraints such as maintaining market share.
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
She cut through the chaff of the obvious to make a recommendation that will send the customer home with a new book, and returning to Strand again and again in the future.
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer.
limited or below average in number or quantity or magnitude
Many optimization procedures are iterative; they can be thought of as taking a small step, checking our elevation and then taking another small uphill step until we reach a point from which there is no direction in which we can climb any higher.
The Strand bookseller made a brilliant but far-fetched recommendation probably based more on the character of Morrison’s writing than superficial similarities between Morrison and other authors.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
This is not to say that Amazon’s recommendation engine could not have made the same connection; the problem is that this helpful recommendation will be buried far down in the recommendation feed, beneath books that have more obvious similarities to “Beloved.”
We are entering the era of data as drivetrain, where we use data not just to generate more data (in the form of predictions), but use data to produce actionable outcomes.
a material made of cellulose pulp derived mainly from wood or rags or certain grasses
One of the authors of this paper was explaining an iterative optimization technique, and the host says, “So, in a sense Jeremy, your approach was like that of doing a startup, which is just get something out there and iterate and iterate and iterate.”
Great predictive modeling is an important part of the solution, but it no longer stands on its own; as products become more sophisticated, it disappears into the plumbing.
excavation from which ores and minerals are extracted
A great image for optimization in the real world comes up in a recent TechZing podcast with the co-founders of data-mining competition platform Kaggle .
a specific piece of work required to be done as a duty
They can also explore how the distribution of profit is shaped by the inputs outside of the insurer’s control: “What if the economy crashes and the customer loses his job?
levers the insurance company could control: what price to charge each customer, what types of accidents to cover, how much to spend on marketing and customer service, and how to react to their competitors’ pricing decisions.
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
While the insurers were reluctant to conduct these experiments on real customers, as they’d certainly lose some customers as a result, they were swayed by the huge gains that optimized policy pricing might deliver.
guided by experience and observation rather than theory
ODG approached this problem with an early use of the Drivetrain Approach and a practical take on step 4 that can be applied to a wide range of problems.
a mercantile establishment for the sale of goods or services
This encompasses all the interactions that a retailer has with its customers outside of the actual buy-sell transaction, whether making a product recommendation, encouraging the customer to check out a new feature of the online store, or sending sales promotions.
Making the wrong choices comes at a cost to the retailer in the form of reduced margins (discounts that do not drive extra sales), opportunity costs for the scarce real-estate on their homepage (taking up space in the recommendation feed with products the customer doesn’t like or would have bought without a recommendation) or the customer tuning out (sending so many unhelpful email promotions that the customer filters all future communications as spam).
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.
Someone using Google’s self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work.
not the same one or ones already mentioned or implied
Google realized that the objective was to show the most relevant search result; for other companies, it might be increasing profit, improving the customer experience, finding the best path for a robot, or balancing the load in a data center.
workplace or land used for growing crops or raising animals
These firms have plenty of experience building models of each of the components and systems in their final product, whether they’re building a server farm or a fighter jet.
The operator can adjust the input levers to answer specific questions like, “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?”
Insurers have centuries of experience in prediction, but as recently as 10 years ago, the insurance companies often failed to make optimal business decisions about what price to charge each new customer.
They also considered inputs outside of their control, like competitors’ strategies, macroeconomic conditions, natural disasters, and customer “stickiness.”
a way of doing something, especially a systematic way
We don’t claim that the Drivetrain Approach is the best or only method; our goal is to start a dialog within the data science and business communities to advance our collective vision.
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
the extent of a two-dimensional surface within a boundary
In another area where objective-based data products have the power to change lives, the CMU extension in Silicon Valley has an active project for building data products to help first responders after natural or man-made disasters .
What we would really like to do is emulate the experience of Mark Johnson, CEO of Zite , who gave a perfect example of what a customer’s recommendation experience should be like in a recent TOC talk .
While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 of the search results.
a limited period of time during which something lasts
The objective is to escape a recommendation filter bubble , a term which was originally coined by Eli Pariser to describe the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
Jeannie Stamberger of Carnegie Mellon University Silicon Valley explained to us many of the possible applications of predictive algorithms to disaster response, from text-mining and sentiment analysis of tweets to determine the extent of the damage, to swarms of autonomous robots for reconnaissance and rescue, to logistic optimization tools that help multiple jurisdictions coordinate their responses.
The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with books he or she would not have purchased without the recommendation .
a more or less definite period of time now or previously present
These products range from weather forecasting to recommendation engines to services that predict airline flight times more accurately than the airline itself.