MERN Stack in AWS - amazon-web-services

I am new to the MERN stack and managed to build an app. I want to deploy it on AWS, but the problem is that I have to use DocumentDB instead of MongoDB. Do I need to rewrite my code to do this? Can I use the same mongoose methods? Please help, I am very new to this.

DocumentDB is API-compatible with MongoDB for the most part (that's its whole claim to fame), so you most likely won't have to change anything.
There are, however, some limitations and differences between the two systems, which are documented here. Unfortunately the article is too long to summarize briefly, so I'm just including the list of subtopics; check out the docs for more details. A minimal connection sketch follows the list.
Admin Databases and Collections
cursor.maxTimeMS()
explain()
Field Name Restrictions
Index Builds
Lookup with empty key in path
MongoDB APIs, Operations, and Data Types
mongodump and mongorestore Utilities
Result Ordering
Retryable Writes
Sparse Index
Storage Compression
Using $elemMatch Within an $all Expression
$distinct and $elemMatch Indexing
$lookup
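For illustration, here is a minimal connection sketch against DocumentDB, shown with pymongo (the same connection-string options apply when connecting from mongoose); the cluster endpoint, credentials and database name are placeholders. DocumentDB requires TLS with the Amazon CA bundle and does not support retryable writes, which is why retryWrites=false appears in the URI.

```python
from pymongo import MongoClient

# Placeholder endpoint and credentials; substitute your own cluster values.
uri = (
    "mongodb://appuser:apppassword@"
    "my-cluster.cluster-xxxxxxxx.us-east-1.docdb.amazonaws.com:27017/"
    "?tls=true&tlsCAFile=global-bundle.pem"
    "&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false"
)

client = MongoClient(uri)
db = client["myapp"]  # placeholder database name

# Ordinary MongoDB driver calls work unchanged against DocumentDB.
db.users.insert_one({"name": "Alice"})
print(db.users.find_one({"name": "Alice"}))
```

In other words, the application code stays the same; only the connection string (and the TLS certificate bundle downloaded from AWS) changes.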

Related

Can Elasticsearch be used as a database in a Django Project?

Can we directly use Elasticsearch as a primary database in Django? I have tried to find a solution but could not find any relevant information. Everywhere it is said that we can use Elasticsearch as a search engine on top of another primary database, but as per my understanding Elasticsearch is a NoSQL database, so there should be a way to use it as the primary database in a Django project.
Please help if someone has any idea about it.
The short answer is no.
SO already has an answer here and this is still valid: Using ElasticSearch as Database/Storage with Django
ES is not ACID compliant
Indexing is not immediate, so any kind of load would be an issue
It's very weakly consistent
Use it together with a proper database and it will help with real-time searches, analytics, expensive queries, etc., but treat it as derived data.
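To make the "derived data" point concrete, here is a minimal sketch of that hybrid setup, assuming a hypothetical Article model, a local Elasticsearch node and the 8.x Python client; Postgres stays the source of truth and ES only mirrors the searchable fields.

```python
# Sketch: Elasticsearch as derived data next to the primary relational DB.
# `Article`, the index name and the ES endpoint are all placeholders.
from django.db.models.signals import post_save
from django.dispatch import receiver
from elasticsearch import Elasticsearch

from myapp.models import Article  # hypothetical model

es = Elasticsearch("http://localhost:9200")  # assumed local node


@receiver(post_save, sender=Article)
def index_article(sender, instance, **kwargs):
    # Re-index the row whenever it is saved; Postgres remains authoritative.
    es.index(
        index="articles",
        id=instance.pk,
        document={"title": instance.title, "body": instance.body},
    )


def search_articles(text):
    # Full-text search goes to ES, then the real rows come from Postgres.
    hits = es.search(index="articles", query={"match": {"title": text}})["hits"]["hits"]
    return Article.objects.filter(pk__in=[hit["_id"] for hit in hits])
```

If the index is ever lost or drifts out of sync, it can simply be rebuilt from the database, which is exactly what treating it as derived data means in practice.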

Updating all entities of KIND in Google Cloud Datastore

We have a dataset of ~10 million entities of a certain Kind in Datastore. We want to change the product's functionality, so we would like to change the fields on all entities of that Kind.
Is there a smart/quick way to do it, that does not involve iterating over all of the entities in series?
You can probably use Dataflow to help with this.
Dataflow is a stream and batch data processing service, fully managed by GCP.
Its programming model was open-sourced as the Apache Beam project, and Dataflow is fully compatible with the Beam SDK, which lets you test your pipelines locally before running them on GCP.
It exposes two main concepts: a PCollection, which is basically the data being handled by the tool, and a Pipeline, which is the series of steps needed to read the data, apply the necessary transformations, and write the results where they should go.
It provides support for Java, Python and Go, a rich feature set, and a variety of possible data sources and transformations.
In the specific case of Datastore, Dataflow provides support for reading, writing and deleting data. See for instance the relevant documentation for Python.
You can see a good example of how to interact with Datastore in the Apache Beam GitHub repository.
These two other articles could also be interesting: 1 2.
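As a rough illustration of the kind of pipeline described above, here is a sketch using the Beam Python SDK's Datastore connectors (apache-beam[gcp]); the project ID, the Kind and the field change are placeholders, not taken from the question.

```python
# Sketch of a bulk-update pipeline over Datastore entities with Apache Beam.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.io.gcp.datastore.v1new.datastoreio import (
    ReadFromDatastore,
    WriteToDatastore,
)
from apache_beam.io.gcp.datastore.v1new.types import Query

PROJECT = "my-gcp-project"  # placeholder project ID
KIND = "Product"            # placeholder Kind


def update_fields(entity):
    # Example transformation: rename a property on every entity.
    props = entity.properties
    if "old_field" in props:
        props["new_field"] = props.pop("old_field")
    return entity


options = PipelineOptions(project=PROJECT)  # add --runner=DataflowRunner etc. to run on GCP
with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> ReadFromDatastore(Query(kind=KIND, project=PROJECT))
        | "Update" >> beam.Map(update_fields)
        | "Write" >> WriteToDatastore(PROJECT)
    )
```

Because Dataflow distributes the PCollection across workers, the 10 million entities are processed in parallel instead of in series.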
I would presume that you have to loop through each entity and update it, as it's a NoSQL data store like Mongo from what I can see. We have a system that uses SQL and Mongo, and the denormalised data is a pain; we had to write migrations that would loop through everything and update it.

AWS hosted application need autosuggest & wild card search feature

I'm not sure if this is the correct platform to ask an architecture-related question. I have a web application developed in Node.js & TypeScript hosted in AWS, with MongoDB as the backend. My requirement is to include a search box with wildcard and autosuggest functionality, so that when I start typing in the text box it suggests results, just like Google search does. How would I achieve this? Querying MongoDB on every keystroke will be slow, and if hundreds of users start doing that my application might start to struggle, so I need your suggestions.
Not tried yet, as this is more a request for architecture help.
It's not a very detailed answer but may point you in a direction.
I just built something similar using AWS Lambda, ElasticSearch and API Gateway.
ElasticSearch is great for text searches but needs to be populated with indexed data.
If your dataset changes, you will have to remember to keep ElasticSearch updated.
API Gateway routes requests from HTTP to Lambda, of which there are two:
one for analysing data in my data warehouse and producing indices for ElasticSearch, the other for doing the actual search and returning results.
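For the "actual search" Lambda, a minimal sketch could look like the following; it is written in Python here (the same query works from the JavaScript client), and the endpoint, index and field names are placeholders. A match_phrase_prefix query is a simple way to get prefix-style autosuggest; a completion suggester is another option.

```python
# Minimal sketch of the search Lambda behind API Gateway.
# Endpoint, index and field names are placeholders.
import json

from elasticsearch import Elasticsearch

es = Elasticsearch("https://my-es-domain.example.com:9200")  # assumed endpoint


def handler(event, context):
    # With the API Gateway proxy integration, the typed text arrives as ?q=...
    prefix = (event.get("queryStringParameters") or {}).get("q", "")
    result = es.search(
        index="products",
        query={"match_phrase_prefix": {"title": prefix}},
        size=10,
    )
    suggestions = [hit["_source"]["title"] for hit in result["hits"]["hits"]]
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(suggestions),
    }
```

Since the suggestions come from the Elasticsearch index rather than MongoDB, the per-keystroke queries never touch the primary database.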

Using Amazon Redshift for analytics for a Django app with Postgresql as the database

I have a working Django web application that currently uses Postgresql as the database. Moving forward I would like to perform some analytics on the data and also generate reports etc. I would like to make use of Amazon Redshift as the data warehouse for the above goals.
In order not to affect the performance of the existing Django web application, I was thinking of writing a NEW Django application that would essentially leverage a READ-ONLY replica of the PostgreSQL database and continuously write data from that replica to Amazon Redshift. My thinking is that perhaps the NEW Django application could handle some or all of the Extract, Transform and Load functions.
My questions are as follows:
1. Does the Django ORM work well with Amazon Redshift? If yes, how does one handle the model schema translations? Any pointers in this regard would be greatly appreciated.
2. Is there any better alternative to achieve the goals listed above?
Thanks in advance.

Data Warehouse and Django

This is more of an architectural question than a technological one per se.
I am currently building a business website/social network that needs to store large volumes of data and use that data to draw analytics (consumer behavior).
I am using Django and a PostgreSQL database.
Now my question is: I want to expand this architecture to include a data warehouse. The ideal would be: the operational DB would be the current Django PostgreSQL database, and the data warehouse would be something additional, preferably in a multidimensional model.
We are still in a very early phase; we are going to test with 50 users, so something primitive, such as a one-column table, would be enough for starters.
I would like to know if somebody has experience with this situation and could recommend a framework for creating a data warehouse, all while maintaining the operational DB with the Django models for ease of use (if possible).
Thank you in advance!
Here are some cool Open Source tools I used recently:
Kettle - great ETL tool, you can use this to extract the data from your operational database into your warehouse. Supports any database with a JDBC driver and makes it very easy to build e.g. a star schema.
Saiku - nice Web 2.0 frontend built on Pentaho Mondrian (MDX implementation). This allows your users to easily build complex aggregation queries (think Pivot table in Excel), and the Mondrian layer provides caching etc. to make things go fast. Try the demo here.
My answer does not necessarily apply to data warehousing. In your case I see the possibility of implementing a NoSQL database solution alongside the OLTP relational storage, which in this case is PostgreSQL.
Why consider NoSQL? In addition to the obvious scalability benefits, NoSQL offers a number of advantages that will probably apply to your scenario, for instance the flexibility of having records with different sets of fields, and key-based access.
Since you're still in the "trial" stage, you might find it easier to decide on a NoSQL database solution depending on your hosting provider. For instance, AWS has SimpleDB, Google App Engine provides its own Datastore, etc. However, there are plenty of other NoSQL solutions you can go for that have nice Python bindings.