Abstract:
Enterprises often have a large number of databases and other sources of tabular data whose columns are full of domain-specific jargon (e.g., alpha-numeric codings, undeclared abbreviations) that usually requires domain experts to decode. Because unique jargon words and alpha-numeric codes are absent from the vocabularies of pre-trained language models such as Wiki2Vec [21], such models cannot readily encode the cell semantics. We propose a deep-learning-based framework, ideally suited for a serverless computing environment, that 1) uses a new tokenization method, called Cell-Masking, 2) encodes the semantics of cells into contextual embeddings that exploit the locality features of tabular data, called Cell2Vec, and 3) provides a supervised learning solution in the form of an attention-based neural network, called TableNN, that classifies cell entries into predefined column classes. We apply the proposed method to three publicly available datasets of varying sizes from different industries. Cell-Masking yields an order-of-magnitude lower loss value and the quickest convergence for cell-embedding generation. For Cell2Vec, we demonstrate that including row and column context improves embedding quality, with better loss-curve convergence and a 5.4% accuracy improvement on the BTS dataset [3].
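The abstract does not spell out how row and column context is gathered; as a purely illustrative sketch (the function name, context-windowing scheme, and example table below are assumptions, not the paper's actual Cell2Vec implementation), one way to pair each cell with its row and column neighbours as training context, masking out the target cell itself, is:

```python
# Hypothetical sketch of the row/column "locality" context idea behind
# Cell2Vec. The paper's actual Cell-Masking/Cell2Vec details are not given
# in the abstract, so everything below is an illustrative assumption.

def cell_contexts(table):
    """For each cell, collect its row and column neighbours as context,
    excluding (masking) the target cell itself."""
    pairs = []
    n_rows, n_cols = len(table), len(table[0])
    for r in range(n_rows):
        for c in range(n_cols):
            target = table[r][c]
            row_ctx = [table[r][j] for j in range(n_cols) if j != c]
            col_ctx = [table[i][c] for i in range(n_rows) if i != r]
            pairs.append((target, row_ctx + col_ctx))
    return pairs

# Example: a tiny table with jargon-like codes
table = [["A1X", "ORD-7", "NY"],
         ["B2Y", "ORD-9", "SF"]]
pairs = cell_contexts(table)
# Each pair is (target cell, surrounding row + column cells), which could
# then feed a skip-gram-style embedding objective over cell tokens.
```

Such (target, context) pairs could then be consumed by any word2vec-style training loop, with the row/column neighbours playing the role of the context window.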
Date of Conference: 15-18 December 2021
Date Added to IEEE Xplore: 13 January 2022