Data

Open Data Source

This menu item opens the file selector or the database selector and then starts the Data Import Wizard.

  • Text file: Once the file is read and the pre-processing is done, a fully unconnected network is created in a new graph window, with one node per attribute. The Bayesian network learning methods then become available.

  • Database: Once the database table is loaded and the pre-processing is done, a fully unconnected network is created in a new graph window, with one node per attribute. The Bayesian network learning methods then become available.

  • Recent databases: Keeps a list of the recently opened databases. The Data Import Wizard opens directly on the selected file. The size of this list can be modified through the Menus settings.

Associate Data Source

This menu item allows opening the Data association wizard in order to associate data from a text file or a database with an existing Bayesian network.

  • Recent databases: Keeps a list of the recently opened databases. The Data association wizard opens directly on the selected file. The size of this list can be modified through the Menus settings.

When the network structure is modified during the association (addition of nodes or states), the conditional probability tables are automatically recomputed from the database. If the structure remains unmodified, the conditional probability tables are not modified.

Associate Dictionary

This menu item allows defining the properties of the active Bayesian network by means of text files. These properties concern arcs, nodes and states:

  • Arc:

    • Arcs: allows associating a set of arcs with the network. The listed arcs can be added to or removed from the network. Arc removals are always performed before arc additions. Before an arc is added, all the constraints belonging to the Bayesian network, as well as the arc constraints and the temporal indices, are checked. If a constraint is not satisfied, the arc is not added.

    • Forbidden Arcs: allows associating with the network a set of forbidden arcs.

    • Arc Comments: allows associating with the network a set of arc comments.

    • Arc Colors: allows associating with the network a set of colors on the arcs.

    • Fixed Arcs: allows defining if some arcs are fixed or not.

  • Node:

    • Node Renaming: allows renaming each node with a new name. The new names must all be different.

    • Comments: allows associating a comment with each node that is in the file.

    • Classes: allows organizing nodes in subsets called classes. A node can belong to several classes at the same time. Classes make it possible to apply some node properties to all the nodes belonging to the same class. They also allow creating constraints on arc creation during learning.

    • Colors: allows associating colors with the nodes or classes that are in the file. Colors are written as Red Green Blue values with 8 bits per channel in hexadecimal (web format): for example, red is 255 red, 0 green, 0 blue, which gives FF0000. Green gives 00FF00, yellow gives FFFF00, etc.

    • Images: allows associating images with the nodes or classes that are in the file. Each image is given by its path relative to the directory containing the dictionary.

    • Costs: allows associating a cost with each node. A node without a cost is called not observable.

    • Temporal Indices: allows associating temporal indices with the nodes that are in the file. BayesiaLab's learning algorithms use these temporal indices to enforce constraints on the probabilistic relations, for example forbidding arcs from future nodes to past nodes. The rule that is used to add an arc from node N1 to node N2 is:

    • If the temporal index of N1 is positive or zero, then the arc from N1 to N2 is only possible if the temporal index of N2 is greater than or equal to the index of N1.

    • Local Structural Coefficients: allows setting the local structural coefficient of each specified node or each node of each specified class.

    • State Virtual Numbers: allows setting the state virtual number of each specified node or each node of each specified class.

    • Locations: allows setting the position of each node.

  • State:

    • State Renaming: allows renaming each state of each node with a new name.

    • State Values: allows associating a numerical value with each state of each node.

    • State Long Names: allows associating with each state of each node a long name that is more explicit than the default state name. This name can be used when exporting a database, in the HTML reports and in the monitors.

    • Filtered States: allows defining one state of each node as a filtered state.

Dictionary File Structures

Arc

Arcs

Name of the arc's starting node or class; ->, <- or -- (to indicate both possible orientations); name of the arc's ending node or class; Equal, Space or Tab; true for an added arc or false for a removed arc. If an arc appears several times, the last occurrence is used.

Forbidden Arcs

Name of the arc's starting node or class; ->, <- or -- (to indicate both possible orientations); name of the arc's ending node or class.

Comments

Name of the arc's starting node or class; ->, <- or -- (to indicate both possible orientations); name of the arc's ending node or class; Equal, Space or Tab; comment. The comment can be any character string without a line break (HTML or plain text). If an arc appears several times, the last occurrence is used.

Colors

Name of the arc's starting node or class; ->, <- or -- (to indicate both possible orientations); name of the arc's ending node or class; Equal, Space or Tab; color. The color is defined as Red Green Blue values with 8 bits per channel, written in hexadecimal (web format). For example, green gives 00FF00, yellow gives FFFF00, blue gives 0000FF, pink gives FFC0FF, etc. If an arc appears several times, the last occurrence is used.
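The hexadecimal color notation can be checked with a few lines of Python. This is only an illustrative sketch: the function names `rgb_to_hex` and `hex_to_rgb` are hypothetical helpers and are not part of BayesiaLab.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Return the 6-digit uppercase hexadecimal (web format) string for an RGB color."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in 8 bits (0 to 255)")
    return f"{red:02X}{green:02X}{blue:02X}"


def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Decode a 6-digit hexadecimal color string into its three channels."""
    if len(color) != 6:
        raise ValueError("expected exactly 6 hexadecimal digits")
    return int(color[0:2], 16), int(color[2:4], 16), int(color[4:6], 16)


print(rgb_to_hex(255, 192, 255))  # FFC0FF, the pink quoted in the text
print(hex_to_rgb("FFFF00"))       # (255, 255, 0), i.e. yellow
```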

Fixed Arcs

Name of the arc's starting node or class; ->, <- or -- (to indicate both possible orientations); name of the arc's ending node or class; Equal, Space or Tab; true for a fixed arc or false for an arc that is not fixed. If an arc appears several times, the last occurrence is used.

Node

Node Renaming

Name of the node; Equal, Space or Tab; new node name. The new name must be valid (different from t or T and without the ? character). A node should appear only once; otherwise the last occurrence is used.

Comments

Name of the node or the class; Equal, Space or Tab; comment. The comment can be any character string without a line break (HTML or plain text). A node should appear only once; otherwise the last occurrence is used.
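To make the "last occurrence" rule concrete, here is a minimal Python sketch of how such a file could be read. It is not BayesiaLab's own parser. It assumes exactly one separator character (Equal, Space or Tab) between the name and the value, and a backslash before any separator character that belongs to the name. The names `split_entry` and `load_entries` and the sample node `Smoking` are hypothetical.

```python
SEPARATORS = "= \t"


def split_entry(line: str) -> tuple[str, str]:
    """Split a dictionary line at its first unescaped separator.

    Returns (name, value). A backslash protects the character that follows it.
    """
    name_chars = []
    i = 0
    while i < len(line):
        char = line[i]
        if char == "\\" and i + 1 < len(line):
            name_chars.append(line[i + 1])  # keep the escaped character literally
            i += 2
        elif char in SEPARATORS:
            return "".join(name_chars), line[i + 1:]
        else:
            name_chars.append(char)
            i += 1
    return "".join(name_chars), ""  # no separator found: empty value


def load_entries(lines: list[str]) -> dict[str, str]:
    """Build a name -> value mapping in which a later line replaces an earlier one."""
    entries = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        name, value = split_entry(line)
        entries[name] = value  # last occurrence wins
    return entries


sample = [
    "Smoking=First comment",
    "Visit\\ Asia\tTravelled recently",
    "Smoking=Second comment",
]
print(load_entries(sample))
```

In this sample, `Smoking` ends up with "Second comment" because its second line overrides the first.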

Classes

Name of the node; Equal, Space or Tab; name of the class. The class can be any character string. A node that appears several times is associated with each of the listed classes.
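Unlike the other node dictionaries, repeated lines in a Classes file accumulate instead of overriding each other. The following Python sketch illustrates that behavior under simplifying assumptions: node names contain no escaped characters, and the first Equal, Space or Tab on a line is the separator. The function `load_classes` and the sample node and class names are hypothetical.

```python
import re
from collections import defaultdict


def load_classes(lines):
    """Map each node name to the set of all classes listed for it."""
    classes = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = re.split(r"[= \t]", line, maxsplit=1)
        if len(parts) != 2:
            continue  # no separator on this line: skip it in this sketch
        node, class_name = parts
        classes[node].add(class_name)  # repeated nodes accumulate classes
    return dict(classes)


sample = ["Age=Demographics", "Age=Risk factors", "Smoker=Risk factors"]
print(load_classes(sample))
```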

Colors

Name of a node or a class; Equal, Space or Tab; color. The color is defined as Red Green Blue values with 8 bits per channel, written in hexadecimal (web format). For example, green gives 00FF00, yellow gives FFFF00, blue gives 0000FF, pink gives FFC0FF, etc. A node should appear only once; otherwise the last occurrence is used.

Images

Name of a node or a class; Equal, Space or Tab; path to the image, relative to the directory containing the dictionary. The image path must be a valid relative path or an empty string. A node should appear only once; otherwise the last occurrence is used.

Costs

Name of the node; Equal, Space or Tab; value of the cost, or empty to make the node not observable. The cost is an empty string or a real number greater than or equal to 1. A node should appear only once; otherwise the last occurrence is used.
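The constraint on cost values can be expressed as a small validation function. This Python sketch is an illustration only: `parse_cost` is a hypothetical helper, and returning `None` for a not observable node is a convention chosen here.

```python
from typing import Optional


def parse_cost(text: str) -> Optional[float]:
    """Return the cost as a float, or None for an empty string (not observable)."""
    if text == "":
        return None
    cost = float(text)  # raises ValueError for non-numeric text
    if not cost >= 1:   # also rejects NaN
        raise ValueError("a cost must be greater than or equal to 1")
    return cost


print(parse_cost("2.5"))  # 2.5
print(parse_cost(""))     # None
```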

Temporal Indices

Name of the node; Equal, Space or Tab; value of the index, or empty to delete an existing index. The index is an integer. A node should appear only once; otherwise the last occurrence is used.
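The arc rule given for temporal indices under Associate Dictionary can be written as a one-line test. The Python sketch below assumes both nodes have an index. It also assumes that no condition applies when the index of N1 is negative, because the rule as stated only covers a positive or zero index. The function name `arc_allowed` is hypothetical.

```python
def arc_allowed(index_n1: int, index_n2: int) -> bool:
    """Apply the stated rule for adding an arc from N1 to N2."""
    if index_n1 >= 0:
        # N1 has a positive or zero index: N2 must not be earlier than N1.
        return index_n2 >= index_n1
    # Assumption: the rule imposes nothing when N1's index is negative.
    return True


print(arc_allowed(0, 1))  # True: arc from an earlier node to a later node
print(arc_allowed(2, 1))  # False: arc from a later node to an earlier node
```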

Local Structural Coefficients

Name of the node; Equal, Space or Tab; value of the local structural coefficient, or empty to reset it to the default value 1. The local structural coefficient is an empty string or a real number greater than 0. A node should appear only once; otherwise the last occurrence is used.

State Virtual Numbers

Name of the node; Equal, Space or Tab; virtual number of states, or empty to delete an existing number. The state virtual number is an empty string or an integer greater than or equal to 2. A node should appear only once; otherwise the last occurrence is used.

Locations

Name of the node; Equal, Space or Tab; position. The location is represented by two real numbers separated by a Space. The first number represents the x-coordinate of the node and the second number the y-coordinate. A node should appear only once; otherwise the last occurrence is used.

State

State Renaming

Name of the node or class, dot (.), name of the state; Equal, Space or Tab; new state name. Alternatively, to rename the state for all nodes: state name; Equal, Space or Tab; new state name. The new name must be a valid state name. A state should appear only once; otherwise the last occurrence is used.

State Values

Name of the node or class, dot (.), name of the state; Space or Tab; real value. Alternatively, to associate a value with a state whatever the node: name of the state; Equal, Space or Tab; real value. The value is a real number. A state should appear only once; otherwise the last occurrence is used.

State Long Names

Name of the node or class, dot (.), name of the state; Equal, Space or Tab; long name. Alternatively, to associate a long name with a state whatever the node: name of the state; Equal, Space or Tab; long name. The long name is a string. A state should appear only once; otherwise the last occurrence is used.

Filtered States

Name of the node or class, dot (.), name of the filtered state. Alternatively, to set the filter property on a state whatever the node: name of the filtered state alone. A state should appear only once; otherwise the last occurrence is used.

As indicated by the syntax, the name of a node, class or state in the text file cannot contain unescaped equal, space or tab characters. If names in the network contain such characters, each of them must be preceded by a \ (backslash) character in the text file: for example, the node named Visit Asia is written Visit\ Asia in the file.
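The escaping rule is easy to automate when dictionary files are generated by a script. The Python sketch below uses a hypothetical helper, `escape_name`, which puts a backslash before every equal, space or tab character. How a literal backslash in a name should be written is not specified here, so the sketch leaves backslashes untouched.

```python
def escape_name(name: str) -> str:
    """Prefix every equal, space or tab character in a name with a backslash."""
    escaped = []
    for char in name:
        if char in "= \t":
            escaped.append("\\")
        escaped.append(char)
    return "".join(escaped)


print(escape_name("Visit Asia"))  # Visit\ Asia
```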

To distinguish a name that is shared by a class, a node or a state, add the suffix "c" for a class, "n" for a node or "s" for a state at the end of the name.

If your network contains non-ASCII characters, you must save your dictionaries with UTF-8 (Unicode) encoding. For example, in MS Excel, choose "Save as" and select "Text Unicode (*.txt)" as the file type. In Notepad, choose "Save as" and select "UTF-8" as the encoding. If your file contains only ASCII characters, you can keep the default encoding (which depends on the platform), but using UTF-8 (Unicode) encoding is strongly encouraged so that dictionary files do not depend on the user's platform. For example, a Chinese dictionary can then be read by a German user without any problem, whatever platforms are used. If you are not sure how to save a file with UTF-8 encoding, export a dictionary with BayesiaLab, modify and save it with any text editor, and load it back into BayesiaLab.
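When a dictionary is produced by a script instead of a text editor, the encoding can be set explicitly. The Python sketch below writes a node comment dictionary in UTF-8. The helper `write_dictionary`, the file name and the node names are hypothetical. The names are assumed to contain no equal, space or tab characters, so no escaping is needed.

```python
import tempfile
from pathlib import Path


def write_dictionary(path, entries):
    """Write "name=value" lines to a text file encoded as UTF-8 (no BOM)."""
    text = "".join(f"{name}={value}\n" for name, value in entries.items())
    Path(path).write_text(text, encoding="utf-8")


# Hypothetical node names and comments containing non-ASCII characters.
comments = {"Âge": "Âge du patient", "Größe": "Körpergröße in cm"}

target = Path(tempfile.mkdtemp()) / "node_comments.txt"
write_dictionary(target, comments)
print(target.read_text(encoding="utf-8"))
```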

Export Dictionary

This menu item allows exporting the different kinds of dictionaries to text files.

The dictionary files are saved with UTF-8 (Unicode) encoding in order to support any character of any language. An option in the Import and Associate preferences, Save Format, controls whether the BOM (Byte Order Mark) is written at the beginning of the file. The BOM increases compatibility with Microsoft applications. On other platforms such as Unix, Linux or Mac OS X, the BOM is not necessary and, in some cases, is treated as extra characters at the beginning of the file.
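Scripts that read exported dictionaries should tolerate both variants. As a sketch, Python's standard `utf-8-sig` codec decodes UTF-8 and drops a leading BOM if one is present, and `codecs.BOM_UTF8` holds the three BOM bytes. The helper names and the sample content are hypothetical.

```python
import codecs
import tempfile
from pathlib import Path


def has_utf8_bom(path) -> bool:
    """Return True if the file starts with the 3-byte UTF-8 byte order mark."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8


def read_text_ignoring_bom(path) -> str:
    """Decode a UTF-8 file; the "utf-8-sig" codec drops a leading BOM if present."""
    return Path(path).read_text(encoding="utf-8-sig")


folder = Path(tempfile.mkdtemp())
with_bom = folder / "with_bom.txt"
without_bom = folder / "without_bom.txt"
with_bom.write_text("Smoking=FF0000\n", encoding="utf-8-sig")  # writes the BOM
without_bom.write_text("Smoking=FF0000\n", encoding="utf-8")   # no BOM

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(read_text_ignoring_bom(with_bom) == read_text_ignoring_bom(without_bom))
```

Reading with plain `utf-8` instead would leave the BOM as an invisible first character of the first name, which is the "extra characters" problem mentioned above.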

Associate an Evidence Scenario File

This menu item allows associating an evidence scenario file with the network.

Export an Evidence Scenario File

This menu item exports the evidence scenario file associated with the network to a text file.

Save Data

This menu item allows saving the database associated with the network, including the results of the pre-processing carried out within the Data Import Wizard (discretization, aggregation, filtering, etc.). If the imported database still contains missing values and the algorithm selected to process them is one of the two imputation algorithms (static or dynamic), this option lets you perform all your imputation tasks by saving a database without any missing values. Each missing value is replaced by taking into account its conditional probability distribution, returned by the Bayesian network, given all the known values of the line. If the database contains both test data and learning data, the user can choose which data to save: only learning data, only test data or all the data. It is also possible to save only the data corresponding to the selected nodes.

The states' long names can be saved instead of the states' names. The numerical values in the database associated with the continuous nodes can be saved if they exist. If there are no numerical values associated with the database and the option is checked, the numerical values are created by randomly generating a value in each relevant interval. If the database contains weights, they are saved as the first column of the output file.
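As an illustration of generating a numerical value for a discretized state, the Python sketch below draws one value between the bounds of the state's interval. The uniform distribution is an assumption made for this sketch, since the text only says that the value is generated randomly within the interval. The function name, the state labels and the interval bounds are hypothetical.

```python
import random


def random_value_in_interval(lower: float, upper: float, rng: random.Random) -> float:
    """Draw one value between the interval bounds (the uniform draw is an assumption)."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return rng.uniform(lower, upper)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
intervals = {"<=30": (18.0, 30.0), ">30": (30.0, 65.0)}  # hypothetical discretization
record_state = ">30"
low, high = intervals[record_state]
print(random_value_in_interval(low, high, rng))
```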

Imputation

Allows the imputation of the missing values of the associated database according to the mode selected in the dialog box.

The data are saved in the specified file. If the database contains both test data and learning data, the user can choose on which data to perform imputation: only learning data, only test data or all the data. The states' long names can be saved instead of the states' names. The numerical values in the database associated with the continuous nodes can be saved if they exist. If there are no numerical values associated with the database and the option is checked, the numerical values are created by randomly generating a value in each relevant interval. However, if there are numerical values in the database, the missing numerical values are generated from the distribution function of each interval. If the database contains weights, they are saved as the first column of the output file.

Graphs

Opens the graph editor if a database is associated with the current network.




Copyright © 2024 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.