CISC 5640

NoSQL Database Systems

Final Exam, Spring 2021

Instructor: Ruhul Amin

I declare that I will not copy any part of this questionnaire from the class lecture or online materials. Moreover, I ensure that I will not ask any questions to anyone either directly or indirectly by any means. Overall, I will maintain honesty and integrity during exam time.

Name: Douglas Mensah    ID: A18685161

90 Minutes Exam; Date: May 14, 2021

1. Name three Big Data properties and define each.

	Velocity: the speed at which data is generated and must be processed.
	Variety: data comes in different formats (structured, semi-structured, unstructured).
	Volume: data arrives in very large quantities.

2. What do you mean by static vs dynamic, and structured vs unstructured data?

Dynamic data: data that changes over time or is updated periodically, e.g. a website.
Static data: data that does not change after it has been recorded, e.g. an MRI scan.
Structured data: data that follows a predefined model and is organized accordingly, e.g. a relational table.
Unstructured data: data that has no predefined model, e.g. an MP3 file.

3. Which of the 4 types of datasets (static, dynamic, structured, unstructured) are used for RDBMS/NoSQL and why?

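Structured, unstructured, and dynamic. An RDBMS stores structured (relational) data because it relies on a predefined schema. NoSQL databases are used for unstructured and dynamic data because they do not require a fixed schema.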

4. What is the main limitation of Vertical and Horizontal scaling of a database? Explain.

Vertical scaling: limited by the amount of CPU, RAM, and other resources that can be configured on a single machine.
Horizontal scaling: limited by the read-to-write ratio and by communication overhead between nodes (communication imbalance).

5. Horizontal scaling requires Sharding and Replication. Why?

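Sharding to achieve	concurrent access: the data is split across nodes, so different nodes can serve different parts of it in parallel.
Replication to achieve	scalability: copies of the data on several nodes spread the read load and keep the data available if a node fails.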

6. What do you mean by ACID properties? Explain with example.

Atomicity	The whole transaction happens at once or does not happen at all. If one operation fails, the entire transaction is aborted. Example: if a transaction runs several join queries followed by an update and one join fails, the whole transaction is aborted.
Consistency	This follows from atomicity. The database must be in a consistent (valid) state before and after the transaction. Example: before a client performs an update, the state of the database must be the same across the distribution, and after the update the change must be reflected at every node.
Isolation	Transactions execute independently of each other. Example: for a pair of transactions such as a delete and an update, it appears to the update that the delete either finished before the update started or started after it finished.
Durability	Once a transaction commits, its changes persist even if the server restarts. Example: if a client performed an update and the response was "commit successful", the update must still be in the database after a server restart.
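The atomicity property above can be illustrated with a short Python sketch using the standard `sqlite3` module. The `accounts` table, the account names, and the `transfer` helper are assumptions made for this illustration; they do not come from the exam. The credit is applied first and the debit second, so a failed debit shows the earlier credit being rolled back.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short,
            # which aborts the transaction and undoes the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
failed = transfer(conn, "alice", "bob", 500)  # fails and rolls back
print(ok, failed, balances(conn))
```

After the failed transfer, both balances are the same as before it started, which is the "all or nothing" behavior described under Atomicity.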

7. Which of the ACID properties can be ensured by the 2PC protocol? Explain.

8. CAP theorem is used to explain the limitations of a distributed database. Explain each of the three properties used by this theorem:

Consistency	All clients see the same data at the same time. If you write data to the distributed system, a later read from any node returns that same data.
Availability	The system continues to operate even in the presence of node failures. Every read or write sent to a non-failing node of the cluster gets a successful response.
Partition tolerance	The system continues to operate even when the network fails. If a partition leaves some nodes unable to talk to each other, the system should still function.

9. Why is ‘Loose Consistency’ easier to implement than ‘Strict Consistency’ in a distributed database? Explain with an example.

10. Explain the BASE properties of a distributed database. Give an example of any popular service that uses this property.

BASE:
Basically available: the system guarantees availability. (A fault may occur, but the system will still be available for some users.)
Soft state: the state of the system may change over time, so copies of a data item may be inconsistent.
Eventual consistency: the system will eventually become consistent. Changes made at different nodes propagate to the others, so all copies converge once updates stop.


Example: Banking service
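Soft state and eventual consistency, as described in the answer above, can be illustrated with a small Python simulation. This is only a sketch: the `Replica` class, the `sync` function, and the last-write-wins rule based on a timestamp are assumptions chosen for illustration, and real systems use more elaborate conflict resolution.

```python
class Replica:
    """One node holding its own copy of the data (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Accept the write locally without waiting for other replicas.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions; the newer timestamp wins."""
    for src, dst in ((a, b), (b, a)):
        for key, (timestamp, value) in list(src.store.items()):
            dst.write(key, value, timestamp)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("status", "online", timestamp=1)
r2.write("status", "away", timestamp=2)

before = (r1.read("status"), r2.read("status"))  # replicas disagree (soft state)
sync(r1, r2)
after = (r1.read("status"), r2.read("status"))   # replicas have converged
print(before, after)
```

Both replicas stay available for reads and writes the whole time; they only agree after the synchronization step has run.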

11. What benefits do the following NoSQL data modeling techniques offer? Explain.

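Denormalization	Denormalization is a database optimization: related data is stored together redundantly, so reads avoid joins and are faster, at the cost of extra storage and more work on updates.
Aggregate	Related data is grouped into a single unit (an aggregate) that is read and written together, so a query returns the complete result from one lookup.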
Application side join

12. What are the two important principles of Sharding? Explain.

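1.	Vertical partitioning: a table is split by columns, and different groups of columns are stored on different nodes.
2.	Horizontal partitioning: a table is split by rows, and each shard holds a subset of the rows, usually chosen by a shard key.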
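The two kinds of partitioning named above can be sketched in Python as follows. The sample rows, the field names, and the helper functions are hypothetical, and hashing the key with SHA-256 is just one possible way to pick a shard.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def horizontal_partition(rows, key_field, num_shards):
    """Split by rows: each row goes to exactly one shard."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row[key_field], num_shards)].append(row)
    return shards


def vertical_partition(rows, key_field, fields):
    """Split by columns: the key is kept in both halves so rows can be rejoined."""
    kept = [{k: row[k] for k in (key_field, *fields)} for row in rows]
    rest = [{k: v for k, v in row.items() if k not in fields} for row in rows]
    return kept, rest


rows = [
    {"id": 1, "name": "A", "email": "a@example.com"},
    {"id": 2, "name": "B", "email": "b@example.com"},
    {"id": 3, "name": "C", "email": "c@example.com"},
    {"id": 4, "name": "D", "email": "d@example.com"},
]

by_rows = horizontal_partition(rows, "id", 3)
profile, contact = vertical_partition(rows, "id", ["name"])
print(by_rows)
print(profile[0], contact[0])
```

Because the hash is stable, the same key always maps to the same shard, so a lookup by `id` only has to contact one node.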

13. What is the difference between Peer-to-Peer and Master-Slave replication? Explain with an appropriate architectural diagram for each.


Diagram 1:	Diagram 2:

14. State three differences between NoSQL and RDBMS based on Model, Data, and Schema:

NoSQL	RDBMS
Schemaless	Uses a schema
Unstructured data	Structured data
Document, column-family, key-value, or graph model	Relational model

15. Name 4 NoSQL Database techniques. For each type, include an example database and a service in which such a database can be used more effectively than any other alternative.

1. Document: a document-oriented NoSQL database stores and retrieves data as key-value pairs, but the value part is stored as a document. DB example: MongoDB. Service example: attendance system.

2. Key-value: key-value databases store data as a hash table in which each key is unique and the value can be JSON, a string, or another predefined data type. DB example: Redis. Service example: instant messaging, where a message can be given a time to live and is deleted after it expires.

3. Graph: graph databases store entities and the relationships between them, and suit data with many patterns and mutual relationships, such as social networks, logistics, and spatial data. DB example: Neo4j. Service example: social networking apps or platforms.

4. Column family: columns are grouped into predefined, structured column families. Every column is treated separately, and the values of a single column are stored contiguously. DB example: Cassandra. Service example: big data and data warehousing workloads.

16. Write down MongoDB, Cassandra and Neo4J terminologies being used in place of RDBMS terminologies:

RDBMS	MongoDB	Cassandra	Neo4J
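Database	Database	Keyspace	Database (graph)
Table	Collection	Table (column family)	Node label
Row	Document	Row	Node
Column	Field (key-value pair)	Column	Node property
Join	Embedded document or $lookup	Not supported (denormalize instead)	Relationship
Primary Key	_id	Primary key	Node ID or a property with a uniqueness constraint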

17. What are the three basic operations in Redis? Include an example for each operation.

Put: HSET key field value, e.g. HSET student:1 name "Douglas"

Get: HGET key field, e.g. HGET student:1 name

Delete: HDEL key field, e.g. HDEL student:1 name
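The put, get, and delete operations above can be mimicked with a small in-memory Python model. This is not a Redis client: `TinyHashStore` is a hypothetical class that only imitates the return values of the HSET, HGET, and HDEL commands for a single field, and the key `student:1` is made up for the example.

```python
class TinyHashStore:
    """In-memory stand-in for Redis hashes (illustration only, not a client)."""

    def __init__(self):
        self._hashes = {}  # key -> {field: value}

    def hset(self, key, field, value):
        # Like HSET: returns 1 if the field is new, 0 if it was overwritten.
        fields = self._hashes.setdefault(key, {})
        is_new = field not in fields
        fields[field] = value
        return 1 if is_new else 0

    def hget(self, key, field):
        # Like HGET: returns the value, or None when the key or field is missing.
        return self._hashes.get(key, {}).get(field)

    def hdel(self, key, field):
        # Like HDEL: returns the number of fields removed.
        fields = self._hashes.get(key, {})
        if field not in fields:
            return 0
        del fields[field]
        if not fields:
            del self._hashes[key]  # an empty hash no longer exists as a key
        return 1


store = TinyHashStore()
added = store.hset("student:1", "name", "Douglas")   # put
value = store.hget("student:1", "name")              # get
removed = store.hdel("student:1", "name")            # delete
missing = store.hget("student:1", "name")            # gone after delete
print(added, value, removed, missing)
```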

18. Write down the three collection types (with example) you can use in the Column-Family database:

19. What are the Nodes and Edges in the Graph database? Explain a scenario in which graph database results in gaining time and space complexity.

Nodes: represent entities and contain properties as key-value pairs. A node is roughly comparable to a row in a table.

Edges: represent relationships; each edge connects two nodes.

Node example: a Student node can have name: "Douglas" and age: 20 as key-value pairs.
Edge example: using the movie database example from class, a Person node can be connected to a Movie node by an [:ACTED_IN] relationship.
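A Person/Movie graph like the one in the edge example can be sketched in plain Python. The node IDs, names, and titles below are invented for illustration, and the dictionaries only imitate how a graph database follows relationships directly instead of joining tables.

```python
from collections import defaultdict

# Nodes carry a label and key-value properties (all values are made up).
nodes = {
    "p1": {"label": "Person", "name": "Person A"},
    "p2": {"label": "Person", "name": "Person B"},
    "p3": {"label": "Person", "name": "Person C"},
    "m1": {"label": "Movie", "title": "Movie X"},
    "m2": {"label": "Movie", "title": "Movie Y"},
}

# Each edge connects two nodes: (source, relationship type, target).
edges = [
    ("p1", "ACTED_IN", "m1"),
    ("p2", "ACTED_IN", "m1"),
    ("p2", "ACTED_IN", "m2"),
    ("p3", "ACTED_IN", "m2"),
]

# Index edges in both directions so a traversal only touches neighbours.
outgoing = defaultdict(set)
incoming = defaultdict(set)
for source, rel_type, target in edges:
    outgoing[(source, rel_type)].add(target)
    incoming[(target, rel_type)].add(source)


def co_actors(person_id):
    """Names of people who acted in at least one movie with `person_id`."""
    found = set()
    for movie_id in outgoing[(person_id, "ACTED_IN")]:
        found |= incoming[(movie_id, "ACTED_IN")]
    found.discard(person_id)
    return sorted(nodes[p]["name"] for p in found)


print(co_actors("p1"))
print(co_actors("p2"))
```

The traversal visits only the edges attached to the starting node and its movies, so its cost depends on the size of that neighbourhood and not on the total number of nodes.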

20. Write down the flow-chart of data processing steps (Input, Splitting, Mapping, Shuffling, Reducing, Final Results) in Map-Reduce operations for counting following words:

{Tiger, Deer, Bear, Tiger, Monkey, Bear, Tiger, Deer, Bear}

Use the next page.

Step 1, input: the values coming into the system: {Tiger, Deer, Bear, Tiger, Monkey, Bear, Tiger, Deer, Bear}.
Step 2, splitting: the input is divided into smaller chunks that can be processed in parallel, e.g. (Tiger, Deer, Bear), (Tiger, Monkey, Bear), (Tiger, Deer, Bear).
Step 3, mapping: each word in a chunk is emitted as a key-value pair with the value 1, e.g. Tiger:1, Deer:1, Bear:1.
Step 4, shuffling: pairs with the same key are brought together into one group: Tiger:(1,1,1), Deer:(1,1), Bear:(1,1,1), Monkey:(1).
Step 5, reducing: the values in each group are summed.
Step 6, final results: the total count of each word: Tiger:3, Deer:2, Bear:3, Monkey:1.
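The six steps above can be sketched in Python as one function per phase. This runs in a single process and is not a Hadoop job; the function names and the chunk size of three words are assumptions made for the illustration.

```python
from collections import defaultdict

words = ["Tiger", "Deer", "Bear", "Tiger", "Monkey", "Bear", "Tiger", "Deer", "Bear"]


def split(items, size):
    """Splitting: cut the input into chunks of `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_phase(chunk):
    """Mapping: emit a (word, 1) pair for every word in the chunk."""
    return [(word, 1) for word in chunk]


def shuffle(mapped_chunks):
    """Shuffling: group all values that share the same key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            groups[word].append(count)
    return groups


def reduce_phase(groups):
    """Reducing: sum the values in each group."""
    return {word: sum(counts) for word, counts in groups.items()}


chunks = split(words, 3)
mapped = [map_phase(chunk) for chunk in chunks]
grouped = shuffle(mapped)
result = reduce_phase(grouped)
print(result)
```

In a real cluster each chunk would be mapped on a different node and each group reduced independently, which is what makes the splitting and shuffling steps necessary.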
