Showing posts with label NoSQL. Show all posts

15 March 2016

NoSQL - Primary Key/Composite Key/Partition Key/Clustering Key

The primary key is the general term for the one or more columns used to identify and retrieve data from a table.

The primary key may be SIMPLE

 create table stackoverflow (
      key text PRIMARY KEY,
      data text    
  );
That means it consists of a single column.

But the primary key can also be COMPOSITE (aka COMPOUND), made up of several columns.

 create table stackoverflow (
      key_part_one text,
      key_part_two int,
      data text,
      PRIMARY KEY(key_part_one, key_part_two)    
  );
With a COMPOSITE primary key, the first part of the key is called the PARTITION KEY (in this example key_part_one is the partition key) and the second part of the key is the CLUSTERING KEY (key_part_two).

Please note that both the partition key and the clustering key can consist of several columns. The inner parentheses group the columns that form the partition key:

 create table stackoverflow (
      k_part_one text,
      k_part_two int,
      k_clust_one text,
      k_clust_two int,
      k_clust_three uuid,
      data text,
      PRIMARY KEY((k_part_one,k_part_two), k_clust_one, k_clust_two, k_clust_three)    
  );
Behind these names ...

The Partition Key is responsible for data distribution across your nodes.
The Clustering Key is responsible for data sorting within the partition.
The Primary Key is equivalent to the Partition Key in a single-field-key table.
The Composite/Compound Key is just a multi-column key.
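
A minimal sketch of how these roles show up in queries, assuming the two-column composite-key version of the stackoverflow table (key_part_one as partition key, key_part_two as clustering key). The values are illustrative; token() is the built-in CQL function that returns the partitioner token computed from the partition key.

```cql
-- Rows that share a partition key share a token,
-- so they are stored together in the same partition.
SELECT token(key_part_one), key_part_one, key_part_two FROM stackoverflow;

-- Within a partition, rows are kept sorted by the clustering key,
-- so range and ordering restrictions on it are possible.
SELECT * FROM stackoverflow WHERE key_part_one = 'ronaldo' AND key_part_two > 9;
SELECT * FROM stackoverflow WHERE key_part_one = 'ronaldo' ORDER BY key_part_two DESC;
```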

Usage and content examples

SIMPLE KEY:
insert into stackoverflow (key, data) VALUES ('han', 'solo');
select * from stackoverflow where key='han';
table content

key | data
----+------
han | solo

COMPOSITE/COMPOUND KEY can retrieve "wide rows", that is, all the rows that share one partition key:

insert into stackoverflow (key_part_one, key_part_two, data) VALUES ('ronaldo', 9, 'football player');
insert into stackoverflow (key_part_one, key_part_two, data) VALUES ('ronaldo', 10, 'ex-football player');
select * from stackoverflow where key_part_one = 'ronaldo';

table content

 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |            9 |    football player
      ronaldo |           10 | ex-football player
But you can also query with the full key:

select * from stackoverflow where key_part_one = 'ronaldo' and key_part_two  = 10;
query output

 key_part_one | key_part_two | data
--------------+--------------+--------------------
      ronaldo |           10 | ex-football player
     
Important note: the partition key is the minimum that must be specified to perform a query using a where clause. If you have a composite partition key, like the following

eg: PRIMARY KEY((col1, col2), col10, col4)

you can query only by passing at least both col1 and col2, the two columns that define the partition key. The general rule is that a query must pass all the partition key columns, and can then add clustering columns in the order in which they are declared.
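
The rule can be sketched with a few queries. The table name example, the column types and the literal values are assumptions made for illustration; only the key layout comes from the text above. Exact behaviour for the rejected queries varies between Cassandra versions, and some versions accept them when ALLOW FILTERING is added, so they are left commented out.

```cql
-- Hypothetical table matching the key layout above.
CREATE TABLE example (
    col1 text,
    col2 int,
    col10 text,
    col4 int,
    data text,
    PRIMARY KEY ((col1, col2), col10, col4)
);

-- Accepted: the full partition key, optionally followed by
-- clustering columns in their declared order.
SELECT * FROM example WHERE col1 = 'a' AND col2 = 1;
SELECT * FROM example WHERE col1 = 'a' AND col2 = 1 AND col10 = 'x';
SELECT * FROM example WHERE col1 = 'a' AND col2 = 1 AND col10 = 'x' AND col4 = 7;

-- Not accepted as plain queries:
-- only part of the partition key is given
-- SELECT * FROM example WHERE col1 = 'a';
-- clustering column col10 is skipped before col4
-- SELECT * FROM example WHERE col1 = 'a' AND col2 = 1 AND col4 = 7;
```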

12 March 2016

NOSQL : CQLSH Commands

cqlsh - Starts the CQL interactive terminal.
Syntax : cqlsh [options] [host [port]]

Example
/opt/cassandra/dse-4.8.2/bin/cqlsh optrhdbcasra01adev -u cswgadmin -p admin123

CAPTURE - Captures command output and appends it to a file.

To start capturing the output of a query, specify the path of the file relative to the current directory, enclosing the file name in single quotation marks. The ~ shorthand used in the example below is supported for referring to $HOME.
Output is not shown on the console while it is captured. Only query result output is captured. Errors and output from cqlsh-only commands still appear. To stop capturing output and return to normal display of output, use CAPTURE OFF.

Example :  CAPTURE '~/mydir/myfile.txt'
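
A short sketch of a capture session, assuming the test.airplanes table created in the COPY example of this post already exists. CAPTURE with no argument should report whether capturing is currently on.

```cql
-- Start appending query results to the file.
CAPTURE '~/mydir/myfile.txt'

-- This result goes to the file instead of the console.
SELECT * FROM test.airplanes;

-- Show the current capture state.
CAPTURE

-- Return to normal console output.
CAPTURE OFF
```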

COPY - Imports and exports CSV (comma-separated values) data to and from Cassandra.

COPY table_name ( column, ...) FROM ( 'file_name' | STDIN ) WITH option = 'value' AND ...

COPY table_name ( column, ... ) TO ( 'file_name' | STDOUT ) WITH option = 'value' AND ...

Example :

CREATE KEYSPACE test
  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 };

USE test;

CREATE TABLE airplanes (
  name text PRIMARY KEY,
  manufacturer ascii,
  year int,
  mach float
);

INSERT INTO airplanes
  (name, manufacturer, year, mach)
  VALUES ('P38-Lightning', 'Lockheed', 1937, 0.7);

COPY airplanes (name, manufacturer, year, mach) TO 'temp.csv';

TRUNCATE airplanes;

COPY airplanes (name, manufacturer, year, mach) FROM 'temp.csv';

TRUNCATE airplanes;

COPY airplanes (name, manufacturer, year, mach) FROM STDIN;
The output is:

[Use \. on a line by itself to end input]
[copy]
At the [copy] prompt, enter the following data:
'F-14D Super Tomcat', Grumman, 1987, 2.34
'MiG-23 Flogger', Russian-made, 1964, 2.35
'Su-27 Flanker', U.S.S.R., 1981, 2.35
\.
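
The WITH clause from the syntax above is not used in the example. A minimal sketch, assuming the HEADER and DELIMITER options of COPY and a hypothetical file name airplanes_pipe.csv:

```cql
-- Export with a header row and a pipe as the field separator.
COPY airplanes (name, manufacturer, year, mach) TO 'airplanes_pipe.csv' WITH HEADER = true AND DELIMITER = '|';

-- Import the same file; the options should match how the file was written,
-- so that the header row is skipped and the fields are split correctly.
COPY airplanes (name, manufacturer, year, mach) FROM 'airplanes_pipe.csv' WITH HEADER = true AND DELIMITER = '|';
```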


DESCRIBE - Provides information about the connected Cassandra cluster, or about the data objects stored in the cluster.
DESCRIBE FULL ( CLUSTER | SCHEMA )
| KEYSPACES
| ( KEYSPACE keyspace_name )
| TABLES
| ( TABLE table_name )
| TYPES
| ( TYPE user_defined_type )
| INDEX
| ( INDEX index_name )
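
A few common forms, sketched against the test keyspace and airplanes table from the COPY example (assumed to exist):

```cql
-- List all keyspaces in the cluster.
DESCRIBE KEYSPACES

-- Print the CQL needed to recreate the keyspace and its objects.
DESCRIBE KEYSPACE test

-- List the tables, then show the definition of one of them.
DESCRIBE TABLES
DESCRIBE TABLE test.airplanes
```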

EXPAND - Formats the output of a query vertically.

This command lists the contents of each row of a table vertically, providing a more convenient way to read long rows of data than the default horizontal format. You scroll down to see more of the row instead of scrolling to the right. Each column name appears on a separate line in column one and the values appear in column two.

EXPAND ON | OFF

PAGING - Enables or disables query paging.

PAGING ON|OFF

SHOW - Shows the Cassandra version, host, or tracing information for the current cqlsh client session.

SHOW VERSION
| HOST
| SESSION tracing_session_id

SOURCE - Executes a file containing CQL statements.


SOURCE '~/mydir/myfile.txt'