BayesiaLab Seminars in Washington, D.C.
Artificial Intelligence & No Data
Knowledge Elicitation, Probabilistic Reasoning, and Optimization
with Bayesian Networks and BayesiaLab
Washington DC Economic Partnership (WDCEP)
1495 F Street NW, Pepco Conference Room, Washington, DC 20004
The Seminar at a Glance
In this half-day seminar, we illustrate that dealing with problem domains for which we have little or no data does not mean that we are limited to purely qualitative methods for decision support. Rather, the opposite is true. We demonstrate that human cognitive biases make it mandatory to reason formally and quantitatively about problems that appear qualitative given the absence of data (see Background & Motivation).
No Data, No AI?
Whereas we typically associate Artificial Intelligence with Big Data, we now plan to employ AI at the opposite end of the spectrum, where we have no data. More specifically, we propose using Bayesian networks as a form of Artificial Intelligence that makes formal reasoning and optimization possible under such conditions.
In the practical part of the seminar, we use BayesiaLab to generate a Bayesian network structure that represents various aspects of air travel, including common complications for passengers (see Bayesian Networks & BayesiaLab, Chapter 4). Furthermore, we employ the Bayesia Expert Knowledge Elicitation Environment to extract tacit knowledge from the seminar participants about this domain, which should be all too familiar.
On the basis of this newly-generated Bayesian network, we can reason probabilistically about the implications of actual observations or hypothetical scenarios. Of particular interest are diagnostic inference tasks, i.e. reasoning from observed effects back to the not-directly-observable cause. Our example illustrates that reasoning correctly about a seemingly simple travel problem can become intractable, both cognitively and mathematically, unless we employ Bayesian networks as a reasoning framework. The example also shows how valuable even vague and diverse opinions can become if we systematically elicit them from stakeholders and encode them in a Bayesian network.
Optimization Under Uncertainty
In addition to ad hoc inference, we can use the Bayesian network in conjunction with BayesiaLab's search algorithms to identify the optimal course of action for achieving the desired outcome, given current conditions and uncertainties and while also taking into account any costs and utilities.
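To make the idea of an optimal course of action concrete, here is a minimal sketch of expected-utility maximization in Python. It is not BayesiaLab's search algorithm, which this page does not describe; it simply enumerates two hypothetical actions for a traveler with a tight connection. The probability of missing the connection, the action names, and all utility values are invented for illustration.

```python
# Hypothetical belief after taking current evidence into account:
# probability of each outcome if the traveler keeps the original connection.
p_outcome = {"connection_made": 0.7, "connection_missed": 0.3}

# Hypothetical utilities (benefit minus cost) for each action and outcome.
# Rebooking has the same payoff either way: a later arrival plus a change fee.
utility = {
    "keep": {"connection_made": 100.0, "connection_missed": -200.0},
    "rebook": {"connection_made": 40.0, "connection_missed": 40.0},
}

def expected_utility(action):
    """Probability-weighted average utility of one action."""
    return sum(p_outcome[o] * utility[action][o] for o in p_outcome)

# Exhaustive enumeration stands in for a real search over many actions.
expected = {action: expected_utility(action) for action in utility}
best_action = max(expected, key=expected.get)

print(expected)
print("best action:", best_action)
```

With these made-up numbers, the guaranteed but modest payoff of rebooking beats the gamble of keeping the connection. A different probability or fee would change the answer, which is why the probabilities need to come from a properly computed model.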
Value of Information
Finally, the Bayesian network allows us to quantify the value of information and measure the consistency (or conflict) of different pieces of actual or hypothetical evidence. Most importantly, the network can identify those pieces of yet-to-be-observed evidence that would reduce our uncertainty the most or would have the greatest impact on determining the optimal course of action.
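One common way to rank yet-to-be-observed evidence is by its mutual information with the target variable, i.e. the expected reduction in the target's entropy. The Python sketch below assumes two hypothetical candidate observations, "weather" and "day_of_week", each given as a small joint probability table with the target "delay". All numbers are invented, and the sketch is not a description of how BayesiaLab computes these quantities.

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits for a joint table {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Hypothetical joint distributions of each candidate with the target "delay".
candidates = {
    "weather": {
        ("bad", "delayed"): 0.14, ("bad", "on_time"): 0.06,
        ("good", "delayed"): 0.08, ("good", "on_time"): 0.72,
    },
    # Constructed to be independent of the target, so it carries no information.
    "day_of_week": {
        ("weekend", "delayed"): 0.066, ("weekend", "on_time"): 0.234,
        ("weekday", "delayed"): 0.154, ("weekday", "on_time"): 0.546,
    },
}

scores = {name: mutual_information(table) for name, table in candidates.items()}
most_informative = max(scores, key=scores.get)

print(scores)
print("observe first:", most_informative)
```

In this toy setup, observing the weather is worth far more than observing the day of the week, because the latter was constructed to be independent of the delay.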
Who Should Attend?
Decision scientists, strategists, policy makers, policy analysts, military planners, lawyers, intelligence analysts, business analysts, knowledge engineers, operations research analysts, supply chain experts, epidemiologists, etc.
Bayesian networks, knowledge engineering, knowledge elicitation, uncertainty, entropy, mutual information, Kullback-Leibler Divergence, inference, diagnosis, simulation, Bayes factor, decision making, optimization, strategy, value of information.
Participants who complete this seminar program will automatically receive a BayesiaLab certificate after the event. All BayesiaLab badges and digital credits are managed via Credly, which allows you to accept your credits and share them via social media platforms.
Please remember that Bayesian network skills are in high demand these days, so don't delay in claiming and posting your credits!
Throughout the seminar, we alternate slide presentations and practical software demonstrations. We encourage a lively dialogue throughout the seminar, so there is plenty of opportunity for Q&A. Also, we'll have two coffee breaks for networking and offline questions.
Complexity & Cognitive Challenges
It is presumably fair to state that reasoning in complex environments creates cognitive challenges for humans. Adding uncertainty to our observations of the problem domain, or even considering uncertainty regarding the structure of the domain itself, makes matters worse. When uncertainty blurs so many premises, it can be particularly difficult to find a common reasoning framework for a group of stakeholders.
No Data, No Analytics.
If we had hard observations from our domain in the form of data, it would be quite natural to build a traditional analytic model for decision support. However, the real world often yields only fragmented data or no data at all. It is not uncommon that we merely have the opinions of individuals who are more or less familiar with the problem domain.
To an Analyst With Excel, Every Problem Looks Like Arithmetic.
In the business world, it is typical to use spreadsheets to model the relationships between variables in a problem domain. Also, in the absence of hard observations, it is reasonable that experts provide assumptions instead of data. Any such expert knowledge is typically encoded in the form of single-point estimates and formulas. However, using single values and formulas instantly oversimplifies the problem domain: firstly, the variables, and the relationships between them, become deterministic; secondly, the left-hand side versus right-hand side nature of formulas restricts inference to only one direction.
Taking No Chances!
Given that cells and formulas in spreadsheets are deterministic and only work with single-point values, they are well suited for encoding “hard” logic, but not at all for “soft” probabilistic knowledge that includes uncertainty. As a result, any uncertainty has to be addressed with workarounds, often in the form of trying out multiple scenarios or by working with simulation add-ons.
It Is a One-Way Street!
The lack of omni-directional inference, however, may be the bigger issue in spreadsheets. As soon as we create a formula linking two cells in a spreadsheet, e.g. B1=function(A1), we preclude any evaluation in the opposite direction, from B1 to A1.
Assuming that A1 is the cause, and B1 is the effect, we can indeed use a spreadsheet for inference in the causal direction, i.e. perform a simulation. However, even if we were certain about the causal direction between them, unidirectionality would remain a concern. For instance, if we were only able to observe the effect B1, we could not infer the cause A1, i.e. we could not perform a diagnosis from effect to cause. The one-way nature of spreadsheet computations prevents this.
Bayesian Networks to the Rescue!
Bayesian networks are probabilistic by default and handle uncertainty “natively.” A Bayesian network model can work directly with probabilistic inputs, probabilistic relationships, and deliver correctly computed probabilistic outputs. Also, whereas traditional models and spreadsheets are of the form y=f(x), Bayesian networks do not have to distinguish between independent and dependent variables. Rather, a Bayesian network represents the entire joint probability distribution of the system under study. This representation facilitates omni-directional inference, which is what we typically require for reasoning about a complex problem domain.
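The contrast with a one-way formula can be sketched in a few lines of Python. The example assumes a hypothetical two-node network in which "weather" is the cause and "delay" is the effect; the variable names and all probabilities are invented for illustration. Once the joint distribution is written down, the same table answers questions in both directions.

```python
# Hypothetical prior on the cause and conditional table for the effect.
p_weather = {"bad": 0.2, "good": 0.8}
p_delay_given_weather = {
    "bad": {"delayed": 0.7, "on_time": 0.3},
    "good": {"delayed": 0.1, "on_time": 0.9},
}

# Joint distribution: P(weather, delay) = P(weather) * P(delay | weather).
joint = {
    (w, d): p_weather[w] * p_delay_given_weather[w][d]
    for w in p_weather
    for d in p_delay_given_weather[w]
}

def marginal_delay(d):
    """P(delay = d), obtained by summing the joint over the cause."""
    return sum(p for (_, dd), p in joint.items() if dd == d)

def posterior_weather(w, d):
    """P(weather = w | delay = d), i.e. Bayes' rule read off the joint."""
    return joint[(w, d)] / marginal_delay(d)

# Simulation: from cause to effect.
p_delayed = marginal_delay("delayed")
# Diagnosis: from observed effect back to the cause.
p_bad_given_delayed = posterior_weather("bad", "delayed")

print(f"P(delayed) = {p_delayed:.3f}")
print(f"P(bad weather | delayed) = {p_bad_given_delayed:.3f}")
```

Under these assumed numbers, observing a delay raises the probability of bad weather from the prior of 0.2 to roughly 0.64, a diagnostic step that a formula of the form B1=function(A1) cannot perform.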
Everybody is talking about "Big Data" and all the opportunities that are associated with it. Very often, though, we hear almost as much about the challenges that come with this flood of data. Where to store it, how to analyze it, how to explain it, the list goes on and on. We think this is a very nice problem to have. Much more serious problems exist on the opposite end of the spectrum, where there is not enough data. Unfortunately, all the advanced knowledge discovery algorithms fail in the absence of data.
In over ten years of continuous development, and in increasingly sophisticated ways, BayesiaLab has permitted deriving knowledge from data through its machine learning algorithms, very much in the spirit of understanding "Big Data." However, BayesiaLab has maintained an equal focus on managing knowledge that exists beyond measurable and countable data points, such as the knowledge contained in the human mind. BayesiaLab's graphical user interface has made it highly intuitive for individual subject matter experts to encode their own domain understanding into a Bayesian network, thus capturing what they explicitly or implicitly know. Most importantly, one can easily and formally capture causal directions in a Bayesian network graph, which is something that few other frameworks can do.
However, when it comes to consolidating the collective knowledge from a group of experts, rather than from an individual, the process is no longer as straightforward. Traditionally, one would perhaps bring the experts together in a brainstorming session and let them form a common understanding. Subsequently, such a consensus could be encoded manually. However, brainstorming sessions are prone to introducing a wide range of biases, which can be disastrously counterproductive in studying complex domains.
Bayesia Expert Knowledge Elicitation Environment, or BEKEE for short, is a new web application that is designed to minimize detrimental group biases. The central idea is not to coerce consensus, but rather to elicit everyone's individual views regarding the domain under study. In order to ensure the independent elicitation of probabilities, BEKEE queries stakeholders individually via an interactive questionnaire linked to the core BayesiaLab application. Retrieving expert views in such a fashion generates many "parallel universes" in terms of domain understanding. These different perspectives can be formally compared by the facilitator and potentially returned to the group for a formal debate in the case of seriously conflicting assessments.
In most cases, this is an iterative process and, even if stakeholder opinions do not converge, BayesiaLab will compile all views and produce a unifying Bayesian network. This graph is now the mathematically correct summary of all the available expert opinions. As such, it can be utilized as a formal representation of the underlying domain. Most importantly, this graph is not merely a qualitative illustration. Rather, a Bayesian network is a fully computable model of the domain, which immediately facilitates the simulation of what-if scenarios.
In fact, we can evaluate this Bayesian network model the same way as a statistical model estimated from "Big Data." One might still prefer a data-based model, if data were indeed available, but in the absence thereof, the formally-encoded collective expert knowledge best represents what is known at the time.
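As a rough illustration of consolidating independently elicited assessments, the Python sketch below averages several experts' probability estimates for the same question and reports their spread. This page does not specify how BEKEE or BayesiaLab combine the individual views, so the weighted average used here (a linear opinion pool) is an assumption, and the expert names and numbers are hypothetical.

```python
def pool(assessments, weights=None):
    """Weighted average of probability assessments (linear opinion pool)."""
    if weights is None:
        weights = [1.0] * len(assessments)  # equal weight per expert
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, assessments)) / total

# Hypothetical, independently elicited answers to one question:
# "How likely is a delay, given bad weather?"
elicited = {"expert_a": 0.6, "expert_b": 0.7, "expert_c": 0.9}

values = list(elicited.values())
pooled = pool(values)

# A large spread flags a question worth returning to the group for debate.
disagreement = max(values) - min(values)

print(f"pooled probability: {pooled:.3f}")
print(f"spread of opinions: {disagreement:.2f}")
```

Unequal weights could reflect differing levels of expertise, for example pool(values, weights=[2, 1, 1]), though how such weights should be chosen is itself a judgment call.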
Frequently Asked Questions
What does it cost to participate?
Registration for this event is free of charge.
What educational background is required to understand the seminar program?
If you are proficient in college-level math and core statistical techniques, such as specifying and estimating multivariate regressions, you will find our seminar program useful.
How should I prepare for the seminar?
We recommend that you read the first three chapters of our book on Bayesian networks. You can download it for free via this link: bayesia.com/book. Also, Chapter 4 in the book will give you a good preview of the case study to be presented in the seminar.
Will the event be broadcast live?
No, you will need to be physically present to participate in the event. Also, the event will not be recorded.
Do I need to bring a computer?
No, you do not need to bring a computer. The instructor will present several practical examples during the seminar, but there will not be enough time for you to run these examples simultaneously during the event. However, you can obtain a BayesiaLab trial version afterward, so you can replicate the examples on your own.
Will you serve lunch at the event?
No, we will not serve a meal at the seminar. Please have lunch on your own beforehand or feel free to bring your lunch.