Create database tables and import POS/NER training and testing data into the database.
◆ crf_test_data()
void crf_test_data(text datapath)
- Input
- Prepare an input test data segment table, e.g.:
- CREATE TABLE test_segmenttbl (start_pos integer, doc_id integer, seg_text text, max_pos integer);
sql> select * from test_segmenttbl order by doc_id, start_pos;
 start_pos | doc_id | seg_text    | max_pos
-----------+--------+-------------+---------
         0 |      1 | the         |      26
         1 |      1 | madlib      |      26
         2 |      1 | mission     |      26
         3 |      1 | :           |      26
         4 |      1 | to          |      26
         5 |      1 | foster      |      26
         6 |      1 | widespread  |      26
         7 |      1 | development |      26
         8 |      1 | of          |      26
         9 |      1 | scalable    |      26
        10 |      1 | analytic    |      26
        11 |      1 | skills      |      26
        12 |      1 | ,           |      26
        13 |      1 | by          |      26
 ...
        24 |      1 | open-source |      26
        25 |      1 | development |      26
        26 |      1 | .           |      26
- Usage
- Create tables and import data into the database:
SELECT madlib.crf_test_data('/path/to/modeldata');
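As a quick way to eyeball a segment table before using it, the tokens can be joined back into one line per document. This is only a sketch: it assumes the test_segmenttbl layout shown above and a PostgreSQL version that supports ordered aggregates (string_agg with ORDER BY, available from 9.0); it is not part of the MADlib API.

```sql
-- Sketch only: reassemble each document from its tokens, in position order.
-- Assumes test_segmenttbl exists with the columns shown above.
SELECT doc_id,
       string_agg(seg_text, ' ' ORDER BY start_pos) AS sentence
FROM test_segmenttbl
GROUP BY doc_id
ORDER BY doc_id;
```

For the sample data above, the reassembled text for doc_id 1 would begin with "the madlib mission : to foster". Tokens that are missing or out of order show up immediately in the joined output.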
◆ crf_train_data()
void crf_train_data(text datapath)
- Input
- Prepare an input training data segment table, e.g.:
- CREATE TABLE train_segmenttbl (start_pos integer, doc_id integer, seg_text text, max_pos integer);
sql> select * from train_segmenttbl order by doc_id, start_pos;
 start_pos | doc_id | seg_text        | max_pos
-----------+--------+-----------------+---------
         0 |      1 | madlib          |       9
         1 |      1 | is              |       9
         2 |      1 | an              |       9
         3 |      1 | open-source     |       9
         4 |      1 | library         |       9
         5 |      1 | for             |       9
         6 |      1 | scalable        |       9
         7 |      1 | in-database     |       9
         8 |      1 | analytics       |       9
         9 |      1 | .               |       9
         0 |      2 | it              |      16
         1 |      2 | provides        |      16
         2 |      2 | data-parallel   |      16
         3 |      2 | implementations |      16
 ...
        14 |      2 | unstructured    |      16
        15 |      2 | data            |      16
        16 |      2 | .               |      16
- Prepare an input dictionary table, e.g.:
- Prepare an input label table, e.g.:
- Prepare an input regex table, e.g.:
- Prepare an input feature table, e.g.:
- Prepare a CRF feature set table, e.g.:
- Usage
- Create tables and import data into the database:
SELECT madlib.crf_train_data('/path/to/modeldata');
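The training segment table described above can be populated by hand for a small experiment. The sketch below loads the ten tokens of doc_id 1 from the sample listing and then checks a convention that is inferred from that sample rather than stated in the documentation: start_pos is zero-based and contiguous within a document, and max_pos repeats the last start_pos of that document. The check query is illustrative and not part of MADlib.

```sql
-- Sketch only: table layout copied from the CREATE TABLE statement above.
-- Skip this statement if train_segmenttbl already exists.
CREATE TABLE train_segmenttbl (
    start_pos integer,
    doc_id    integer,
    seg_text  text,
    max_pos   integer
);

-- Tokens of doc_id 1, taken from the sample listing above.
INSERT INTO train_segmenttbl (start_pos, doc_id, seg_text, max_pos) VALUES
    (0, 1, 'madlib',      9),
    (1, 1, 'is',          9),
    (2, 1, 'an',          9),
    (3, 1, 'open-source', 9),
    (4, 1, 'library',     9),
    (5, 1, 'for',         9),
    (6, 1, 'scalable',    9),
    (7, 1, 'in-database', 9),
    (8, 1, 'analytics',   9),
    (9, 1, '.',           9);

-- Assumed convention (inferred from the sample): positions run 0..max_pos
-- with no gaps, and max_pos is the same on every row of a document.
-- Any document listed by this query breaks that convention.
SELECT doc_id,
       min(start_pos) AS first_pos,
       max(start_pos) AS last_pos,
       count(*)       AS n_tokens,
       max(max_pos)   AS declared_max_pos
FROM train_segmenttbl
GROUP BY doc_id
HAVING min(start_pos) <> 0
    OR max(start_pos) <> max(max_pos)
    OR count(*) <> max(max_pos) + 1
    OR min(max_pos) <> max(max_pos);
```

With only the rows inserted above, the check query returns no rows. The same query can be pointed at test_segmenttbl by changing the table name.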