I am writing a Django command to seed an existing table.
I need to truncate the table before seeding, but there are foreign key constraints on that table.
Because of that, I get a django.db.utils.IntegrityError while truncating the table.
How do I turn foreign key checks off temporarily in Django?
I saw SET FOREIGN_KEY_CHECKS = 0 but don't know where to put it :(
The Django Command class:
from django.core.management.base import BaseCommand

from myapp.models import AwsRegions  # adjust to your app's models module


class Command(BaseCommand):
    help = "Command to seed the aws regions"

    regions = [
        {
            'name': 'US East (N. Virginia)',
            'region': 'us-east-1',
        },
        {
            'name': 'US West (Oregon)',
            'region': 'us-west-2',
        },
        {
            'name': 'EU (Ireland)',
            'region': 'eu-west-1',
        },
    ]

    def handle(self, *args, **options):
        self.stdout.write('seeding regions...')
        AwsRegions.objects.all().delete()  # this is where I get the errors
        for entry in self.regions:
            self.stdout.write(entry['region'])
            AwsRegions.objects.create(name=entry['name'], region=entry['region'])
        self.stdout.write('done seeding regions')
I got the solution.
I had to disable the triggers on the table to stop the foreign key constraint checks.
Disable Triggers
from django.db import connection

def disable_triggers(self):
    with connection.cursor() as cursor:
        cursor.execute('ALTER TABLE "Table Name" DISABLE TRIGGER ALL;')
Enable Triggers
def enable_triggers(self):
    with connection.cursor() as cursor:
        cursor.execute('ALTER TABLE "Table Name" ENABLE TRIGGER ALL;')
Important Notes:
According to the docs, you can pass a list as the second argument to execute() (e.g. if you want to pass the table name dynamically), but the parameters are escaped automatically, so you can end up forming a syntactically invalid PostgreSQL query (this took me a lot of time to figure out).
Make sure you turn the triggers back on properly (see the usage sketch after these notes).
If you get a permission denied error, check the DB user's permissions. I just enabled superuser permissions from pgAdmin, which was OK for my case, and everything worked again.
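For completeness, here is a possible way to wire these helpers into the command's handle() method. This is only a sketch; it assumes the two methods above live on the same Command class and that "Table Name" is the AwsRegions table:
def handle(self, *args, **options):
    self.stdout.write('seeding regions...')
    self.disable_triggers()  # skip FK trigger checks while the table is re-seeded
    try:
        AwsRegions.objects.all().delete()
        for entry in self.regions:
            AwsRegions.objects.create(name=entry['name'], region=entry['region'])
    finally:
        self.enable_triggers()  # always restore the triggers, even if seeding fails
    self.stdout.write('done seeding regions')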
To disable triggers for all tables (useful when you need to disable them for multiple tables at once):
SET session_replication_role TO 'replica'
And to restore:
SET session_replication_role TO 'origin'
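If you prefer to issue these statements from Django rather than psql, a minimal sketch using the default connection could look like this (the helper names are mine, not a Django API):
from django.db import connection

def disable_fk_checks():
    # 'replica' makes PostgreSQL skip the FK constraint triggers for this session
    with connection.cursor() as cursor:
        cursor.execute("SET session_replication_role TO 'replica';")

def enable_fk_checks():
    # 'origin' restores the normal trigger behaviour
    with connection.cursor() as cursor:
        cursor.execute("SET session_replication_role TO 'origin';")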
from django.db import connection
with connection.constraint_checks_disabled():
    do_stuff()
Credit goes to https://stackoverflow.com/a/11926432/2558400
We've set up AWS Secrets Manager as a secrets backend for Airflow (AWS MWAA) as described in their documentation. Unfortunately, it is not explained anywhere where the secrets are then to be found and how they are to be used. When I supply conn_id to a task in a DAG, we see two errors in the task logs: ValueError: Invalid IPv6 URL and airflow.exceptions.AirflowNotFoundException: The conn_id redshift_conn isn't defined. What's even more surprising is that retrieving variables stored the same way with Variable.get('my_variable_id') works just fine.
The question is: Am I wrongly expecting that the conn_id can be directly passed to operators as SomeOperator(conn_id='conn-id-in-secretsmanager')? Must I retrieve the connection manually each time I want to use it? I don't want to run something like read_from_aws_sm_fn in the code below every time beforehand...
Btw, neither the connection nor the variable show up in the Airflow UI.
Having stored a secret named airflow/connections/redshift_conn (and on the side one airflow/variables/my_variable_id), I expect the connection to be found and used when constructing RedshiftSQLOperator(task_id='mytask', redshift_conn_id='redshift_conn', sql='SELECT 1'). But this results in the above error.
I am able to retrieve the redshift connection manually in a DAG with a separate task, but I think that is not how SecretsManager is supposed to be used in this case.
The example DAG is below:
from airflow import DAG, settings, secrets
from airflow.operators.python import PythonOperator
from airflow.utils.dates import days_ago
from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook
from airflow.models.baseoperator import chain
from airflow.models import Connection, Variable
from airflow.providers.amazon.aws.operators.redshift import RedshiftSQLOperator
from datetime import timedelta

sm_secret_id_name = f'airflow/connections/redshift_conn'

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': days_ago(1),
    'retries': 1,
}

def read_from_aws_sm_fn(**kwargs):  # from AWS example code
    ### set up Secrets Manager
    hook = AwsBaseHook(client_type='secretsmanager')
    client = hook.get_client_type('secretsmanager')
    response = client.get_secret_value(SecretId=sm_secret_id_name)
    myConnSecretString = response["SecretString"]
    print(myConnSecretString[:15])
    return myConnSecretString

def get_variable(**kwargs):
    my_var_value = Variable.get('my_test_variable')
    print('variable:')
    print(my_var_value)
    return my_var_value

with DAG(
    dag_id=f'redshift_test_dag',
    default_args=default_args,
    dagrun_timeout=timedelta(minutes=10),
    start_date=days_ago(1),
    schedule_interval=None,
    tags=['example']
) as dag:

    read_from_aws_sm_task = PythonOperator(
        task_id="read_from_aws_sm",
        python_callable=read_from_aws_sm_fn,
        provide_context=True
    )  # works fine

    query_redshift = RedshiftSQLOperator(
        task_id='query_redshift',
        redshift_conn_id='redshift_conn',
        sql='SELECT 1;'
    )  # results in above errors :-(

    try_to_get_variable_value = PythonOperator(
        task_id='get_variable',
        python_callable=get_variable,
        provide_context=True
    )  # works fine!
The question is: Am I wrongly expecting that the conn_id can be directly passed to operators as SomeOperator(conn_id='conn-id-in-secretsmanager')? Must I retrieve the connection manually each time I want to use it? I don't want to run something like read_from_aws_sm_fn in the code below every time beforehand...
Using Secrets Manager as a backend, you don't need to change the way you use connections or variables. They work the same way: when looking up a connection/variable, Airflow follows a search path.
Btw, neither the connection nor the variable show up in the Airflow UI.
The connection/variable will not show up in the UI.
ValueError: Invalid IPv6 URL and airflow.exceptions.AirflowNotFoundException: The conn_id redshift_conn isn't defined
The 1st error is related to the secret itself, and the 2nd error is due to the connection not existing in the Airflow UI.
There are two formats for storing connections in Secrets Manager (depending on the AWS provider version installed); the IPv6 URL error could mean the connection is not being parsed correctly. Here is a link to the provider docs.
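For reference, the older of the two formats stores the whole connection serialized as a single URI string; an illustrative (made-up) value for a secret at airflow/connections/redshift_conn might look like:
# Illustrative only - a URI-style connection secret value
"postgres://db_user:db_pass@redshift-host.example.com:5439/db_name"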
The first step is defining the prefixes for connections and variables; if they are not defined, your secrets backend will not look up the secret:
secrets.backend_kwargs : {"connections_prefix" : "airflow/connections", "variables_prefix" : "airflow/variables"}
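For context, on MWAA both values are supplied as Airflow configuration options, with the backend class set alongside the kwargs; roughly:
secrets.backend : airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
secrets.backend_kwargs : {"connections_prefix" : "airflow/connections", "variables_prefix" : "airflow/variables"}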
Then for the secrets/connections, you should store them in those prefixes, respecting the required fields for the connection.
For example, for the connection my_postgress_conn:
{
    "conn_type": "postgresql",
    "login": "user",
    "password": "pass",
    "host": "host",
    "extra": '{"key": "val"}',
}
You should store it in the path airflow/connections/my_postgress_conn, with the JSON dict serialized as a string.
And for the variables, you just need to store them in airflow/variables/<var_name>.
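As a hedged sketch of that layout (the secret names come from the examples above, the values are placeholders), the secrets could be created with boto3 like this:
import json
import boto3

sm = boto3.client('secretsmanager')

# Connection: the JSON dict serialized as a string, stored under the connections prefix
sm.create_secret(
    Name='airflow/connections/my_postgress_conn',
    SecretString=json.dumps({
        "conn_type": "postgresql",
        "login": "user",
        "password": "pass",
        "host": "host",
        "extra": json.dumps({"key": "val"}),
    }),
)

# Variable: a plain string stored under the variables prefix
sm.create_secret(
    Name='airflow/variables/my_variable_id',
    SecretString='some value',
)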
I have a function create which uses 3 DynamoDB tables. How do I mock three DynamoDB tables?
def create():
    # This function uses a DynamoDB table "x"
    # Then it calls the my_table() function
    ...

def my_table():
    # This function uses two DynamoDB tables, "y" and "z"
    # It returns a value which is used in the create() function
    ...
My test file has the following code:
@patch.dict(os.environ, {"DYNAMODB_TABLE": "x",
                         'second_TABLE': "y",
                         'Third_TABLE': "z"
                         })
def test_create():
    dynamodb_test()
    event = {...}  # my event values
    result = create(event)
    assert result == 200
def dynamodb_test():
    with mock_dynamodb2():
        dynamodb = boto3.client('dynamodb', region_name='us-east-1')
        dynamodb.create_table(
            TableName=os.environ["DYNAMODB_TABLE"],
            KeySchema=[
                {
                    'AttributeName': 'id',
                    'KeyType': 'HASH'
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'id',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 1,
                'WriteCapacityUnits': 1
            }
        )
        yield dynamodb
Whenever I test the test_create() function using pytest, I get:
botocore.exceptions.ClientError: An error occurred (ExpiredTokenException) when
calling the Scan operation: The security token included in the request is expired
I think it's trying to access the actual AWS DynamoDB, but I want it to use the mocked DynamoDB. How can I achieve this?
Moto only works when two conditions are met:
The logic to be tested is executed inside a Moto-context
The Moto-context is started before any boto3-clients (or resources) are created
The Moto-context in your example, with mock_dynamodb2(), is localized to the dynamodb_test-function. After the function finishes, the mock is no longer active, and Boto3 will indeed try to access AWS itself.
Solution
The following test-function would satisfy both criteria:
@patch.dict(os.environ, {"DYNAMODB_TABLE": "x",
                         'second_TABLE': "y",
                         'Third_TABLE': "z"
                         })
# Initialize the mock here, so that it is effective for the entire test duration
@mock_dynamodb2
def test_create():
    dynamodb_test()
    event = {...}  # my event values
    # Ensure that any boto3-clients/resources created in the logic are initialized while the mock is active
    from ... import create
    result = create(event)
    assert result == 200

def dynamodb_test():
    # There is no need to start the mock-context again here, so create the table immediately
    dynamodb = boto3.client('dynamodb', region_name='us-east-1')
    dynamodb.create_table(...)
The test code you provided does not talk about creating tables y and z - if the logic expects them to exist, you'd of course have to create them manually as well (just like table x was created in dynamodb_test).
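As a sketch (the key schema and throughput here are assumptions copied from table x), the extra tables could be created the same way while the mock is active:
def create_additional_tables():
    # Assumes @mock_dynamodb2 is already active on the calling test
    dynamodb = boto3.client('dynamodb', region_name='us-east-1')
    for table_env in ('second_TABLE', 'Third_TABLE'):
        dynamodb.create_table(
            TableName=os.environ[table_env],
            KeySchema=[{'AttributeName': 'id', 'KeyType': 'HASH'}],
            AttributeDefinitions=[{'AttributeName': 'id', 'AttributeType': 'S'}],
            ProvisionedThroughput={'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1}
        )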
Documentation for the import quirk can be found here: http://docs.getmoto.org/en/latest/docs/getting_started.html#recommended-usage
I believe this post is almost identical to yours. You could try this or utilize some of the other existing tools like localstack or dynamodb-local. The Python client for localstack for example: https://github.com/localstack/localstack-python-client
EDIT:
I see from your title that you want to use moto, but I don't see you importing any of the moto modules into your code. See the last snippet on this page and replace s3 with either dynamodb or dynamodb2 (whichever you are using).
Hi Stack Overflow, I'm trying to conditionally put an item into a DynamoDB table. The DynamoDB table has the following attributes.
ticker - Partition Key
price_date - Sort Key
price - Attribute
Every minute I'm calling an API which gives me a minute-by-minute list of dictionaries for all stock prices within the day so far. However, the data I receive from the API can sometimes be behind by a minute or two. I don't particularly want to overwrite all the records within the DynamoDB table every time I get new data. To achieve this, I've tried to create a condition expression so that put_item only succeeds when there is a match on ticker but a new price_date.
I've created a simplification of my code below to better illustrate my problem.
import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('stock-intraday')

data = [
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:30:00.000Z', 'price': 100},
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:31:00.000Z', 'price': 101}
]

for item in data:
    dynamodb_response = table.put_item(Item=item,
                                       ConditionExpression=Attr("ticker").exists() & Attr("price_date").not_exists())
However when I run this code I get this error...
What is wrong with my conditional expression?
I found an answer to my own problem. DynamoDB was throwing an error because my code WAS working; it just needed some minor changes.
There needed to be a try/except block, and since the condition expression is evaluated against the existing item with the same key (if any), only price_date needed to be included in the condition expression.
import boto3
from boto3.dynamodb.conditions import Attr
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('stock-intraday')

data = [
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:30:00.000Z', 'price': 100},
    {'ticker': 'GOOG', 'price_date': '2021-10-08T9:31:00.000Z', 'price': 101}]

for item in data:
    try:
        dynamodb_response = table.put_item(Item=item,
                                           ConditionExpression=Attr("price_date").not_exists())
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            pass
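For what it's worth, the same condition can also be written as a plain expression string instead of the Attr helper. This sketch is equivalent to the loop above, except that it re-raises anything other than a failed condition check (an assumption about how other errors should be handled):
for item in data:
    try:
        table.put_item(
            Item=item,
            # true only when no item with this ticker/price_date key exists yet,
            # so rows already in the table are never overwritten
            ConditionExpression='attribute_not_exists(price_date)'
        )
    except ClientError as e:
        if e.response['Error']['Code'] != 'ConditionalCheckFailedException':
            raise  # don't silently swallow unexpected errors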
I already have a table and I want to add a new attribute to it.
I am trying to do that with the update_item functionality of DynamoDB.
Use case: the Bid table holds details on the bid for a product. The user accepts the bid; once the bid is accepted, I have to add a few attributes to that record, such as the user information. I'm not sure if this is the right way, or whether I should use a new table for this.
Partition key: Pickup
Sort key: DropOff
A demo example of what I am currently trying: I am altering the same table and facing the error below.
import json
import boto3

def lambda_handler(event, context):
    dynamo_client = boto3.resource('dynamodb')
    users = dynamo_client.Table('LoadsandBids')
    item = event['body']
    print("Fuirst")
    users.update_item(
        Key={
            'Pickup': event['body']['Pickup'],
            'DropOff': event['body']['DropOff']
        },
        UpdateExpression='SET #attr1 = :val1',
        ExpressionAttributeNames={'#attr1': 'new_field'},
        ExpressionAttributeValues={':val1': event['body']['new']},
        ReturnValues='UPDATED_NEW'
    )
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
I am getting an error:
"An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema"
Could anyone help me out with this and also suggest whether my approach is good or not?
I am trying to write tests for a serverless application using the AWS serverless framework. I am facing a weird issue. Whenever I try to mock S3 or DynamoDB using moto, it does not work. Instead of mocking, the boto3 call actually goes to my AWS account and tries to do things there.
This is not desirable behaviour. Could you please help?
Sample Code:
import datetime
import boto3
import uuid
import os
from moto import mock_dynamodb2
from unittest import mock, TestCase
from JobEngine.job_engine import check_duplicate

class TestJobEngine(TestCase):
    @mock.patch.dict(os.environ, {'IN_QUEUE_URL': 'mytemp'})
    @mock.patch('JobEngine.job_engine.logger')
    @mock_dynamodb2
    def test_check_duplicate(self, mock_logger):
        id = 'ABCD123'
        db = boto3.resource('dynamodb', 'us-east-1')
        table = db.create_table(
            TableName='my_table',
            KeySchema=[
                {
                    'AttributeName': 'id',
                    'KeyType': 'HASH'
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'id',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 1,
                'WriteCapacityUnits': 1
            }
        )
        table.meta.client.get_waiter('table_exists').wait(TableName='my_table')
        table.put_item(
            Item={
                'id': {'S': id},
                ... other data ...
            }
        )
        res = check_duplicate(id)
        self.assertTrue(mock_logger.info.called)
        self.assertEqual(res, True, 'True')
Please see the code above. I am trying to insert a record into the table and then call a function that verifies whether the specified id is already present in the table. When I run this code, I get a "table already exists" error.
If I disable the network, I get an error:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://dynamodb.us-east-1.amazonaws.com/"
I fail to understand why there is an attempt to connect to AWS if we are trying to mock.
I did some digging and have finally managed to solve this.
See https://github.com/spulec/moto/issues/1793
This issue was due to some incompatibilities between boto and moto. It turns out that everything works fine when botocore is downgraded to 1.10.84.