GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare.

Ali R, Siddiqi MH, Idris M, Ali T, Hussain S, Huh EN, Kang BH, Lee S - Sensors (Basel) (2015)

Bottom Line: Due to the diverse nature of biomedical data, it is difficult to predict outcomes from them. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. An evaluation of the tool on six different sets of locally created, diverse datasets shows that it reduces the time effort of experts and knowledge engineers by 94.1% on average when creating unified datasets.


Affiliation: Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea. rahmanali@oslab.khu.ac.kr.

ABSTRACT
A wide array of biomedical data is generated and made available to healthcare experts. However, because the data are so diverse, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) that provides a global unified data structure for all data sources, together with a "data modeler" tool that generates a unified dataset. The tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we applied a well-known rule creation process based on rough set theory to derive rules from it. An evaluation of the tool on six different sets of locally created, diverse datasets shows that it reduces the time effort of experts and knowledge engineers by 94.1% on average when creating unified datasets.

Figure 4 (sensors-15-15772-f004): The “data modeler” launch interface with all the supported controls.

Mentions: Using these data types, an engineer can select the appropriate data type for each attribute of the unified dataset, as shown in Equation (16):

$$M = \sum_{j=1}^{m} \mathrm{assignDType}(a_j.\mathrm{dType}) \qquad (16)$$

where dType is the list of all available data types and assignDType is a function used to assign data types to attributes of the unified data model. To minimize the chance of errors when assigning data types to attributes, and to produce a high-quality unified dataset as output, the “data modeler” supports selection of the target data analysis tool through a combo box, “select analysis tool”. It contains the list of all target tools, as shown in Figure 4. Before the export operation starts, the target analysis tool is selected from the combo box, which loads the data types of the selected tool into the “data modeler” export manager environment. The loaded data types help knowledge engineers and domain experts assign the correct data type to each attribute and minimize errors. Figure 4 and Figure 5 (step 7) show the process of selecting the data analysis tool and assigning appropriate data types to attributes.
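
The type-assignment step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tool names ("ToolA", "ToolB"), their type lists, and the names `Attribute`, `load_data_types`, `assign_d_type` and `assign_all` are hypothetical, chosen only to mirror the idea that selecting a target analysis tool restricts which data types may be assigned to each attribute of the unified data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical catalogue: target analysis tool -> data types it supports.
# Tool names and type lists are illustrative assumptions, not from the paper.
TOOL_DATA_TYPES: Dict[str, List[str]] = {
    "ToolA": ["numeric", "nominal", "string", "date"],
    "ToolB": ["integer", "real", "text"],
}


@dataclass
class Attribute:
    """One attribute a_j of the unified data model."""
    name: str
    d_type: Optional[str] = None  # unset until a type is assigned


def load_data_types(tool: str) -> List[str]:
    """Mimic the combo-box step: load the data types of the selected tool."""
    if tool not in TOOL_DATA_TYPES:
        raise ValueError(f"unknown analysis tool: {tool!r}")
    return TOOL_DATA_TYPES[tool]


def assign_d_type(attr: Attribute, d_type: str, allowed: List[str]) -> Attribute:
    """Assign a data type to one attribute, rejecting types the tool lacks."""
    if d_type not in allowed:
        raise ValueError(f"{d_type!r} is not supported for attribute {attr.name!r}")
    attr.d_type = d_type
    return attr


def assign_all(attrs: List[Attribute], choices: Dict[str, str], tool: str) -> List[Attribute]:
    """Apply assign_d_type to every attribute a_1..a_m, as in Equation (16)."""
    allowed = load_data_types(tool)
    return [assign_d_type(a, choices[a.name], allowed) for a in attrs]


if __name__ == "__main__":
    attributes = [Attribute("age"), Attribute("gender")]
    typed = assign_all(attributes, {"age": "numeric", "gender": "nominal"}, "ToolA")
    for a in typed:
        print(a.name, a.d_type)
```

Validating each choice against the loaded type list is one way to realize the error-minimizing role the text attributes to the combo box; the actual tool may enforce this differently, for example by only offering valid types in its interface.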

