Extract table name and columns from SQL schema - regex

I need help extracting the table name and columns from a table schema (DDL) using regex.
CREATE TABLE todos (
id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
team_id INTEGER NOT NULL,
title TEXT NOT NULL DEFAULT "Hello World!",
description TEXT NOT NULL UNIQUE,
UNIQUE (title),
PRIMARY KEY (id),
FOREIGN KEY (user_id) REFERENCES users (id),
FOREIGN KEY (team_id) REFERENCES teams (t_id)
ON UPDATE RESTRICT
ON DELETE RESTRICT
)
1. Table name
todos
2. Columns
id // as group 1 (column name)
INTEGER // as group 2 (column type)
NOT NULL // as group 3 (nullability), empty if absent
DEFAULT // as group 4 (default value, for example "Hello World!")
UNIQUE // as group 5 (uniqueness), empty if absent
Note: UNIQUE can also be declared at table level, as it is for the title column.
3. Primary key
id // as group 1 (primary key)
Table level: PRIMARY\sKEY\s+\(([^\)]+)\)
Column level: check below answer.
4. Foreign keys:
// first
user_id // as group 1 (foreign key)
users // as group 2 (reference table name)
id // as group 3 (reference primary)
// second
team_id // as group 1 (foreign key)
teams // as group 2 (reference table name)
t_id // as group 3 (reference primary)
ON UPDATE RESTRICT // as group 4
ON DELETE RESTRICT // as group 5
I've found a simple regex on GitHub (https://github.com/yiisoft/yii2/issues/6351#issuecomment-91064631), but it does not support the ON UPDATE / ON DELETE RESTRICT clauses:
/FOREIGN KEY\s+\(([^\)]+)\)\s+REFERENCES\s+([^\(^\s]+)\s*\(([^\)]+)\)/mi

Extract a table name:
CREATE\s+TABLE\s+([\w_]+)
Get column names:
\s+([\w_]+)[\s\w]+,
Get a primary key field:
\s*PRIMARY\s+KEY\s+\(([\w_]+)\)
Get foreign keys data:
\s*FOREIGN\s+KEY\s+\(([\w_]+)\)\s+REFERENCES\s+([\w_]+)\s+\(([\w_]+)\)
You can test it here (respectively):
https://regexr.com/59251
https://regexr.com/59254
https://regexr.com/5925a
https://regexr.com/594eb
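As a rough sketch, the patterns above can be tried against the DDL from the question in Python. Two things here are my own assumptions rather than part of the answer: the column pattern is line-based (one definition per line, as in the sample) because the column regex above skips the title column, whose quoted default contains characters outside [\s\w]; and the foreign-key pattern is extended with optional ON UPDATE / ON DELETE groups, assumed to appear in that order. The names parse_ddl, COLUMN_RE and so on are hypothetical.

```python
import re

DDL = '''CREATE TABLE todos (
id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
team_id INTEGER NOT NULL,
title TEXT NOT NULL DEFAULT "Hello World!",
description TEXT NOT NULL UNIQUE,
UNIQUE (title),
PRIMARY KEY (id),
FOREIGN KEY (user_id) REFERENCES users (id),
FOREIGN KEY (team_id) REFERENCES teams (t_id)
ON UPDATE RESTRICT
ON DELETE RESTRICT
)'''

TABLE_RE = re.compile(r"CREATE\s+TABLE\s+(\w+)", re.I)
PK_RE = re.compile(r"PRIMARY\s+KEY\s*\((\w+)\)", re.I)
# Groups: column, referenced table, referenced column, ON UPDATE action, ON DELETE action.
FK_RE = re.compile(
    r"FOREIGN\s+KEY\s*\((\w+)\)\s*REFERENCES\s+(\w+)\s*\((\w+)\)"
    r"(?:\s+ON\s+UPDATE\s+(\w+))?(?:\s+ON\s+DELETE\s+(\w+))?",
    re.I,
)
# Groups: name, type, NOT NULL, default value, UNIQUE. Assumes one definition per line.
COLUMN_RE = re.compile(
    r"^(?!(?:UNIQUE|PRIMARY|FOREIGN|CONSTRAINT|CHECK|ON)\b)"  # skip constraint lines
    r"(\w+)[ \t]+(\w+)"
    r"(?:[ \t]+(NOT[ \t]+NULL))?"
    r"""(?:[ \t]+DEFAULT[ \t]+("[^"]*"|'[^']*'|[^\s,]+))?"""
    r"(?:[ \t]+(UNIQUE))?",
    re.I | re.M,
)


def parse_ddl(ddl):
    # Only look for columns and keys between the outermost parentheses.
    body = ddl[ddl.index("(") + 1 : ddl.rindex(")")]
    return {
        "table": TABLE_RE.search(ddl).group(1),
        "columns": COLUMN_RE.findall(body),
        "primary_key": PK_RE.findall(body),
        "foreign_keys": FK_RE.findall(body),
    }


schema = parse_ddl(DDL)
print(schema["table"])
for column in schema["columns"]:
    print(column)
print(schema["primary_key"], schema["foreign_keys"])
```

Groups that do not match come back as empty strings from findall, which lines up with the "empty if nothing" requirement in the question. This is not a SQL parser: composite keys, quoted identifiers and column-level PRIMARY KEY are not handled.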

Each regex returns its result in a named capture group; you can find the name by looking at the (?'GROUP-NAME'..myregex...) part of the pattern. Named groups make it easier to reference and split the results once the search has finished.
FULL SEARCH
((?'COLUMN_NAME'(?<=^\s\s)([[:lower:]]\w+))|(?'PRIMARY_KEY'(?<=PRIMARY\sKEY\s\()(\w+))|(?'TABLE_NAME'(?<=\bTABLE\s)(\w+)))
SPLIT SEARCH
Get table name:
(?'TABLE_NAME'(?<=\bTABLE\s)(\w+))
Get primary key:
(?'PRIMARY_KEY'(?<=PRIMARY\sKEY\s\()(\w+))
Get column name: This one is a little sloppy and only captures column names that are lowercase. Since your text doesn't contain any tab characters, this was the best I could do, but it's a bit risky.
(?'COLUMN_NAME'(?<=^\s\s)([[:lower:]]\w+))
You can run them here, regex101, and try it out.
Be aware that the regex depends on whichever regex engine you are using. Engines differ in what they support, and some patterns might need to be translated for your engine. For example, lookbehind is not supported by all engines.

Related

DAX - How to lookup and return a value from another table based on 2 possible lookups

I have 2 tables, one has a ton of fields so I didn't copy it all but the 2 fields in the big table that I'm working with are "Item Number" and "Item Description". The smaller table is pictured below.
ItemData table
ItemNumber
ItemDescription
Entities
ProductLines
The two tables are not related; I need to have a column in the big table named "Entity" where I lookup the item number or the item description (if the item number is missing) and return what Entity is associated. If both fields are empty then return "NONE".
My current code is below and it works sometimes which doesn't make sense because the code isn't correct, I know. I also can't get it to look at one field if the other is blank which is why that part of the code has been deleted.
Entity = LOOKUPVALUE(ItemData[Entities],ItemData[Item Number],Page1_1[Item Number],"None")
Here is what I want it to say in DAX - Entity = if itemNumber is not null then use item number to retrieve the entity name, otherwise use the itemdescription to find the entity.
Here is what I would like to see:
Item number = "123"
Item Description = "Sunshine"
Entity = "Florida"
I can pull item number and description from the big table. I just need to match those with the small table to get the entity.
You can use an IF expression that checks whether the item number in the big table is blank (DAX separates the branches with commas, not then/else):
Entity = IF(ISBLANK(Page1_1[Item Number]),
LOOKUPVALUE(ItemData[Entities],ItemData[Item Description],Page1_1[Item Description],"NONE"),
LOOKUPVALUE(ItemData[Entities],ItemData[Item Number],Page1_1[Item Number],"NONE"))

Slow Selection Query even after indexing the table (sqlite and c++)

Create tables
I have a database composed of two tables:
ENTITE_CANDIDATE
VARIATIONS
Tables are created by using the following queries:
CREATE TABLE IF NOT EXISTS ENTITE_CANDIDATE (ID INTEGER PRIMARY KEY NOT NULL, ID_KBP TEXT NOT NULL, wiki_title TEXT, type TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS VARIATIONS (ID INTEGER PRIMARY KEY NOT NULL, ID_ENTITE INTEGER, NAME TEXT, TYPE TEXT, LANGUAGE TEXT, FOREIGN KEY(ID_ENTITE) REFERENCES ENTITE_CANDIDATE(ID));
Table ENTITE_CANDIDATE is composed of 818,742 records
Table VARIATIONS is composed of 154,716,653 records
Index tables
I indexed the previous tables by using the following queries:
`CREATE INDEX var_id ON VARIATIONS (ID, ID_ENTITE, NAME);`
`CREATE INDEX entity_id ON ENTITE_CANDIDATE (ID, wiki_title);`
Retrieve information
I want to retrieve from table VARIATIONS the following records:
"SELECT ID, ID_ENTITE, NAME FROM VARIATIONS WHERE NAME=foo ;"
Every select query takes around 5.414931 seconds. I know the table contains a very large number of records, but can I make the retrieval faster? Am I indexing the tables correctly?
The documentation says:
the index might be used if the initial columns of the index … appear in WHERE clause terms.
This query uses only the NAME column to search, so the var_id index cannot be used. (That index is useful only for lookups that use ID, which is mostly useless because the ID column is already indexed as PRIMARY KEY.)
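One way to check this is to ask SQLite for its query plan before and after adding an index whose first column is NAME. The sketch below uses Python's sqlite3 module on an empty in-memory copy of VARIATIONS (the foreign key is left out so the table stands alone); the index name var_name and the helper plan are assumptions for illustration, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VARIATIONS (ID INTEGER PRIMARY KEY NOT NULL, ID_ENTITE INTEGER,
                         NAME TEXT, TYPE TEXT, LANGUAGE TEXT);
CREATE INDEX var_id ON VARIATIONS (ID, ID_ENTITE, NAME);
""")

QUERY = "SELECT ID, ID_ENTITE, NAME FROM VARIATIONS WHERE NAME = ?"


def plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is the readable description.
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("foo",)).fetchall()
    return [row[-1] for row in rows]


before = plan(conn)  # only var_id exists: NAME is not its first column
conn.execute("CREATE INDEX var_name ON VARIATIONS (NAME)")  # hypothetical index name
after = plan(conn)   # the planner can now look NAME up directly

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, so compare the two outputs rather than relying on a particular string: a SCAN walks every row, while a SEARCH using var_name is an index lookup.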

coding many to many relationship in c++ sqlite3

I am trying to code a many-to-many relationship in C++ with sqlite3.
In the diagram below,
a manager can add many job opportunities, and
a job opportunity can be added by many managers.
My CREATE TABLE statements:
"CREATE TABLE Manager(" \
"manager_id INTEGER PRIMARY KEY NOT NULL,"\
"name varchar(45) NOT NULL);"
"CREATE TABLE jobs ("
"jobId INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,"\
"jobTitle varchar(45) NOT NULL);"
"CREATE TABLE Add ("
"manager_id,jobId INTEGER PRIMARY KEY NOT NULL,"\
"date varchar(45) NOT NULL,"\
"FOREIGN KEY(manager_id) REFERENCES Manager(manager_id),"\
"FOREIGN KEY(job_id) REFERENCES jobs(job_id));";
my manager table is populated with the following information
1|john
2|bob
Let's say manager john has added two jobs, with jobTitle jobA and jobB.
My insert statement code then looks like this: http://pastebin.com/0E8CzPgX
My jobs table is then populated with the following information:
1|jobA
2|jobB
The final step is to take the id of john (manager_id = 1) and the two job ids (1, 2) and add them to
the Add table. I don't know how I should code this
so that the Add table ends up looking like this.
add table
manager_id|job_id|date
1 | 1 |30-01-2014
1 | 2 |30-01-2014
Please advise. Thanks.
Do you mean something like
sql = "INSERT INTO Add(manager_id,jobId,date) VALUES (?,?,?);";
?
Your problem seems to be that you defined jobID to be the primary key of the table Add, which you don't need.
jobId INTEGER PRIMARY KEY NOT NULL
A common approach to many-to-many relations in a database is to include an intermediate table.
This intermediate table (let's call it Manager_jobs) would have at least 2 columns, both referring to other tables via foreign key. The first attribute would be the primary key of Manager, the second one the primary key of jobs.
Each time you add a job, you just add an entry to Manager_jobs with the foreign keys respectively.
So, Manager_jobs would look like this:
ManagerID | JobID
==========|======
4 | 2
3 | 2
4 | 1
As you can see, Manager_jobs can encode that a Manager has multiple jobs assigned and vice versa.
This approach, of course, requires you to have some form of primary key for both data tables.
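The intermediate-table approach can be sketched with Python's sqlite3 module (the same SQL statements can be issued from C++ through the sqlite3 C API). The composite primary key on Manager_jobs and the use of lastrowid to pick up each new job id are my assumptions about how to wire this together, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Manager (manager_id INTEGER PRIMARY KEY NOT NULL,
                      name VARCHAR(45) NOT NULL);
CREATE TABLE jobs (jobId INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
                   jobTitle VARCHAR(45) NOT NULL);
CREATE TABLE Manager_jobs (
    manager_id INTEGER NOT NULL REFERENCES Manager(manager_id),
    jobId INTEGER NOT NULL REFERENCES jobs(jobId),
    date VARCHAR(45) NOT NULL,
    PRIMARY KEY (manager_id, jobId)  -- each (manager, job) pair is stored once
);
""")

conn.executemany("INSERT INTO Manager (manager_id, name) VALUES (?, ?)",
                 [(1, "john"), (2, "bob")])

# john (manager_id 1) adds two jobs; link each new job id to him.
for title in ("jobA", "jobB"):
    cur = conn.execute("INSERT INTO jobs (jobTitle) VALUES (?)", (title,))
    conn.execute(
        "INSERT INTO Manager_jobs (manager_id, jobId, date) VALUES (?, ?, ?)",
        (1, cur.lastrowid, "30-01-2014"),
    )

rows = conn.execute(
    "SELECT manager_id, jobId, date FROM Manager_jobs ORDER BY jobId"
).fetchall()
print(rows)
```

In the C API the counterpart of lastrowid is sqlite3_last_insert_rowid(), called right after the insert into jobs.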

sql Column with multiple values (query implementation in a cpp file )

I am using this link.
I have connected my cpp file with Eclipse to my Database with 3 tables (two simple tables
Person and Item
and a third one PersonItem that connects them). In the third table I use one simple primary and then two foreign keys like that:
CREATE TABLE PersonsItems(PersonsItemsId int not null auto_increment primary key,
Person_Id int not null,
Item_id int not null,
constraint fk_Person_id foreign key (Person_Id) references Person(PersonId),
constraint fk_Item_id foreign key (Item_id) references Items(ItemId));
So, then with embedded sql in c I want a Person to have multiple items.
My code:
mysql_query(connection, \
"INSERT INTO PersonsItems(PersonsItemsId, Person_Id, Item_id) VALUES (1,1,5), (1,1,8);");
printf("%ld PersonsItems Row(s) Updated!\n", (long) mysql_affected_rows(connection));
//SELECT newly inserted record.
mysql_query(connection, \
"SELECT Order_id FROM PersonsItems");
//Resource struct with rows of returned data.
resource = mysql_use_result(connection);
// Fetch multiple results
while((result = mysql_fetch_row(resource))) {
printf("%s %s\n",result[0], result[1]);
}
My result is
-1 PersonsItems Row(s) Updated!
5
but with VALUES (1,1,5), (1,1,8);
I would like that to be
-1 PersonsItems Row(s) Updated!
5 8
Can someone tell me why this is not happening?
Kind regards.
I suspect this is because your first insert is failing with the following error:
Duplicate entry '1' for key 'PRIMARY'
This happens because you are trying to insert 1 twice into PersonsItemsId, which is the primary key and so has to be unique (it is also auto_increment, so there is no need to specify a value at all).
This is why rows affected is -1, and why in this line:
printf("%s %s\n",result[0], result[1]);
you are only seeing 5 because the first statement failed after the values (1,1,5) had already been inserted, so there is still one row of data in the table.
I think to get the behaviour you are expecting you need to use the ON DUPLICATE KEY UPDATE syntax:
INSERT INTO PersonsItems(PersonsItemsId, Person_Id, Item_id)
VALUES (1,1,5), (1,1,8)
ON DUPLICATE KEY UPDATE Person_Id = VALUES(Person_Id), Item_id = VALUES(Item_id);
Example on SQL Fiddle
Or do not specify the value for personsItemsID and let auto_increment do its thing:
INSERT INTO PersonsItems(Person_Id, Item_id)
VALUES (1,5), (1,8);
Example on SQL Fiddle
I think you have a typo or mistake in your two queries.
You are inserting "PersonsItemsId, Person_Id, Item_id"
INSERT INTO PersonsItems(PersonsItemsId, Person_Id, Item_id) VALUES (1,1,5), (1,1,8)
and then your select statement selects "Order_id".
SELECT Order_id FROM PersonsItems
In order to achieve 5, 8 as you request, your second query needs to be:
SELECT Item_id FROM PersonsItems
Edit to add:
Your primary key is autoincrement so you don't need to pass it to your insert statement (in fact it will error as you pass 1 twice).
You only need to insert your other columns:
INSERT INTO PersonsItems(Person_Id, Item_id) VALUES (1,5), (1,8)
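The effect of leaving out the key column can be tried quickly with Python's sqlite3 module. Note the assumption: this uses SQLite as a stand-in for MySQL, so the table definition is adapted (INTEGER PRIMARY KEY AUTOINCREMENT instead of auto_increment, no foreign keys), and the way a failed multi-row insert is rolled back may differ from the MySQL setup in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE PersonsItems (
    PersonsItemsId INTEGER PRIMARY KEY AUTOINCREMENT,
    Person_Id INT NOT NULL,
    Item_id INT NOT NULL
)
""")

# Passing the same explicit key twice violates the primary key.
try:
    conn.execute(
        "INSERT INTO PersonsItems (PersonsItemsId, Person_Id, Item_id) "
        "VALUES (1, 1, 5), (1, 1, 8)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Leaving the key out lets the database assign a fresh id to each row.
conn.execute("INSERT INTO PersonsItems (Person_Id, Item_id) VALUES (1, 5), (1, 8)")

items = [row[0] for row in conn.execute(
    "SELECT Item_id FROM PersonsItems ORDER BY PersonsItemsId"
)]
print(duplicate_rejected, items)
```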

how to extract column parameters from sqlite create string?

In SQLite it is possible to get the SQL string with which a table was created:
select sql from sqlite_master where type='table' and tbl_name='MyTable'
this could give:
CREATE TABLE "MyTable" (`id` PRIMARY KEY NOT NULL, [col1] NOT NULL,
"another_col" UNIQUE, '`and`,''another'',"one"' INTEGER, and_so_on);
Now I need to extract from this string any additional parameters that a given column has been declared with.
But this is very difficult, since the column name could be enclosed in quoting characters or written plain, and the name itself may contain characters that are also used for quoting.
I don't know how to approach it. Given a column name, the function should return everything after that name and before the next comma; for example, given id it should return PRIMARY KEY NOT NULL.
Use the pragma table_info:
http://www.sqlite.org/pragma.html#pragma_table_info
sqlite> pragma table_info(MyTable);
cid|name|type|notnull|dflt_value|pk
0|id||1||1
1|col1||1||0
2|another_col||0||0
3|`and`,'another',"one"|INTEGER|0||0
4|and_so_on||0||0
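From code, the same pragma can be read like any other query. Below is a minimal sketch with Python's sqlite3 module; the helper name column_info is hypothetical, and the table is a simplified version of the one in the question (the column with the heavily quoted name is left out). Note that table_info reports type, NOT NULL, default and primary key, but not UNIQUE.

```python
import sqlite3


def column_info(conn, table):
    # PRAGMA arguments cannot be bound as parameters, so quote the name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {
        name: {"type": ctype, "notnull": bool(notnull), "default": dflt, "pk": bool(pk)}
        for cid, name, ctype, notnull, dflt, pk in rows
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "MyTable" (`id` PRIMARY KEY NOT NULL, [col1] NOT NULL, '
    '"another_col" UNIQUE, and_so_on)'
)

info = column_info(conn, "MyTable")
for name, details in info.items():
    print(name, details)
```

The names come back already unquoted, so there is no need to deal with the backtick, bracket or double-quote styles yourself.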