I'm running GStreamer in Docker. GST_DEBUG can be used to turn on additional debug logging, but even if I set GST_DEBUG=2 or GST_DEBUG=3, several DEBUG logs still appear. Here's an example:
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 65524 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 65524 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 65524 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 31003 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Pausing CURL read for upload handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postWriteCallback(): Curl post body write function for stream with handle: DaveTest and upload handle: 0 returned: {"EventType":"RECEIVED","FragmentTimecode":1633701893567,"FragmentNumber":"91343852333181580945486776886085710683522911738"}
2021-10-08 14:04:55 [139759980640000] DEBUG - fragmentAckReceivedHandler invoked
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 20153 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Pausing CURL read for upload handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postWriteCallback(): Curl post body write function for stream with handle: DaveTest and upload handle: 0 returned: {"EventType":"BUFFERING","FragmentTimecode":1633701895543,"FragmentNumber":"91343852333181580950438537043227232143278319293"}
2021-10-08 14:04:55 [139759980640000] DEBUG - fragmentAckReceivedHandler invoked
2021-10-08 14:04:55 [139759980640000] DEBUG - postWriteCallback(): Curl post body write function for stream with handle: DaveTest and upload handle: 0 returned: {"EventType":"PERSISTED","FragmentTimecode":1633701893567,"FragmentNumber":"91343852333181580945486776886085710683522911738"}
2021-10-08 14:04:55 [139759980640000] DEBUG - fragmentAckReceivedHandler invoked
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Wrote 9598 bytes to Kinesis Video. Upload stream handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - postReadCallback(): Pausing CURL read for upload handle: 0
2021-10-08 14:04:55 [139759980640000] DEBUG - Kinesis Video client and stream metrics
>> Overall storage byte size: 536870912
>> Available storage byte size: 536261448
>> Allocated storage byte size: 609464
>> Total view allocation byte size: 144080
>> Total streams frame rate (fps): 1175
>> Total streams transfer rate (bps): 29187312 (28503 Kbps)
>> Current view duration (ms): 433
>> Overall view duration (ms): 1999
>> Current view byte size: 283686
>> Overall view byte size: 606536
>> Current frame rate (fps): 1175.58
>> Current transfer rate (bps): 29187312 (28503 Kbps)
How can I turn these off?
The issue was that the kvssink plugin has its own logging setup, separate from GST_DEBUG. It can be configured with a config file by setting the element's log-config property to the path of that file (see here).
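For illustration, here is a rough sketch of such a config; the path is arbitrary, and the appender names follow the sample kvs_log_configuration shipped with the KVS producer SDK, so check them against your SDK version:

# /opt/kvs_log_configuration -- log4cplus properties; WARN silences the DEBUG chatter
log4cplus.rootLogger=WARN, KvsConsoleAppender
log4cplus.appender.KvsConsoleAppender=log4cplus::ConsoleAppender
log4cplus.appender.KvsConsoleAppender.layout=log4cplus::PatternLayout
log4cplus.appender.KvsConsoleAppender.layout.ConversionPattern=[%-5p][%d] %m%n

# Then point kvssink at it via its log-config property:
gst-launch-1.0 ... ! kvssink stream-name=DaveTest log-config=/opt/kvs_log_configuration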
How exactly do you set this variable? I have used it multiple times by exporting it, e.g. export GST_DEBUG=3, and it worked as it should. See this link: https://gstreamer.freedesktop.org/documentation/gstreamer/gstinfo.html?gi-language=c for information on how to use it programmatically.
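For example (the image name below is a placeholder), you can export it in the shell that runs the pipeline or pass it into the container; GST_DEBUG also accepts per-category levels:

export GST_DEBUG=3                      # global log level for this shell
export GST_DEBUG=2,rtspsrc:5            # level 2 overall, verbose just for rtspsrc
docker run -e GST_DEBUG=2 my-gst-image  # set it for a container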
Related
I am using Kinesis Video Streams for a solution and I am facing problems with a few RTSP sources. I have validated all of the RTSP sources in VLC and they work well, apart from a long delay before the video starts streaming.
I am using gst-launch-1.0 to stream to Kinesis:
gst-launch-1.0 -q rtspsrc location=MYRTSP_URL short-header=TRUE ! rtph264depay ! h264parse ! kvssink stream-name=MYSTREAM_NAME storage-size=128 access-key=MYACCESS_KEY secret-key=MYSECRET_KEY aws-region=us-east-1 retention-period=168
Below is the output after running the command:
log4cplus:ERROR could not open file ../kvs_log_configuration
INFO - createKinesisVideoClient(): Creating Kinesis Video Client
2022-07-04 19:37:21 [140474457073472] INFO - heapInitialize(): Initializing native heap with limit size 134217728, spill ratio 0% and flags 0x00000001
2022-07-04 19:37:21 [140474457073472] INFO - heapInitialize(): Creating AIV heap.
2022-07-04 19:37:21 [140474457073472] INFO - heapInitialize(): Heap is initialized OK
2022-07-04 19:37:21 [140474457073472] DEBUG - getSecurityTokenHandler invoked
2022-07-04 19:37:21 [140474457073472] DEBUG - Refreshing credentials. Force refreshing: 0 Now time is: 1656963441548249256 Expiration: 0
2022-07-04 19:37:21 [140474457073472] INFO - createDeviceResultEvent(): Create device result event.
2022-07-04 19:37:21 [140474457073472] DEBUG - clientReadyHandler invoked
2022-07-04 19:37:21 [140474457073472] INFO - try creating stream
2022-07-04 19:37:21 [140474457073472] INFO - Creating Kinesis Video Stream MYSTREAM_NAME
2022-07-04 19:37:21 [140474457073472] INFO - createKinesisVideoStream(): Creating Kinesis Video Stream.
2022-07-04 19:37:21 [140474457073472] INFO - logStreamInfo(): SDK version: 70f74f14cf27b09f71dc1889f36eb6e04cdd90a8
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Kinesis Video Stream Info
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Stream name: MYSTREAM_NAME
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Streaming type: STREAMING_TYPE_REALTIME
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Content type: video/h264
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Max latency (100ns): 600000000
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Fragment duration (100ns): 20000000
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Key frame fragmentation: Yes
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Use frame timecode: Yes
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Absolute frame timecode: Yes
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Nal adaptation flags: 0
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Average bandwith (bps): 4194304
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Framerate: 25
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Buffer duration (100ns): 1200000000
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Replay duration (100ns): 400000000
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Connection Staleness duration (100ns): 600000000
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Store Pressure Policy: 1
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): View Overflow Policy: 1
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Segment UUID: NULL
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Frame ordering mode: 0
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Track list
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Track id: 1
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Track name: kinesis_video
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Codec id: V_MPEG4/ISO/AVC
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Track type: TRACK_INFO_TYPE_VIDEO
2022-07-04 19:37:21 [140474457073472] DEBUG - logStreamInfo(): Track cpd: NULL
2022-07-04 19:37:21 [140474457073472] INFO - writeHeaderCallback(): RequestId: 9165f977-7fe3-413b-9b22-6ac81a5f7e8a
2022-07-04 19:37:22 [140474293741120] DEBUG - describeStreamCurlHandler(): DescribeStream API response: {"StreamInfo":{"CreationTime":1.656961857643E9,"DataRetentionInHours":168,"DeviceName":"Kinesis_Video_Device","IngestionConfiguration":null,"KmsKeyId":"arn:aws:kms:us-east-1:MYACCOUNT_NUMBER:alias/aws/kinesisvideo","MediaType":"video/h264","Status":"ACTIVE","StreamARN":"arn:aws:kinesisvideo:us-east-1:MYACCOUNT_NUMBER:stream/MYSTREAM_NAME/1656961857643","StreamName":"MYSTREAM_NAME","Version":"MYVERSION"}}
2022-07-04 19:37:22 [140474293741120] INFO - describeStreamResultEvent(): Describe stream result event.
2022-07-04 19:37:22 [140474293741120] INFO - writeHeaderCallback(): RequestId: 65213fea-917f-4d14-88d6-fb854d3a08cd
2022-07-04 19:37:22 [140474285348416] DEBUG - getStreamingEndpointCurlHandler(): GetStreamingEndpoint API response: {"DataEndpoint":"https://AAAAAAA.kinesisvideo.us-east-1.amazonaws.com"}
2022-07-04 19:37:22 [140474285348416] INFO - getStreamingEndpointResultEvent(): Get streaming endpoint result event.
2022-07-04 19:37:22 [140474285348416] DEBUG - getStreamingTokenHandler invoked
2022-07-04 19:37:22 [140474285348416] DEBUG - Refreshing credentials. Force refreshing: 1 Now time is: 1656963442867748518 Expiration: 18446744073709551615
2022-07-04 19:37:22 [140474285348416] INFO - getStreamingTokenResultEvent(): Get streaming token result event.
2022-07-04 19:37:22 [140474285348416] DEBUG - streamReadyHandler invoked
2022-07-04 19:37:22 [140474285348416] Stream is ready
INFO - kinesisVideoStreamFormatChanged(): Stream format changed.
DEBUG - Dropping frame with flag: 97920:02:08.5 / 99:99:99.
2022-07-04 19:39:31 [140473742124608] INFO - putStreamResultEvent(): Put stream result event. New upload handle 0
INFO - writeHeaderCallback(): RequestId: dc9a7e24-b124-5e42-87a7-3a109b2af291
2022-07-04 19:39:32 [140473750517312] DEBUG - Dropping frame with flag: 1536
DEBUG - postReadCallback(): Pausing CURL read for upload handle: 0
DEBUG - Dropping frame with flag: 15360:02:10.0 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:10.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:11.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:12.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:13.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:14.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:15.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:16.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:17.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:18.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:19.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:20.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:21.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:22.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:23.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:24.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:25.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:26.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:27.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:28.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:29.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:30.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:31.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:32.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:33.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:34.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:35.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:36.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:37.5 / 99:99:99.
WARN - curlCompleteSync(): curl perform failed for url https://AAAAAAA.kinesisvideo.us-east-1.amazonaws.com/putMedia with result Timeout was reached: Operation too slow. Less than 30 bytes/sec transferred the last 30 seconds
2022-07-04 19:40:02 [140473750517312] WARN - curlCompleteSync(): HTTP Error 0 : Response: (null)
Request URL: https://AAAAAAA.kinesisvideo.us-east-1.amazonaws.com/putMedia
Request Headers:
Authorization: AWS4-HMAC-SHA256 Credential=MYACCESS_KEY/20220704/us-east-1/kinesisvideo/aws4_request, SignedHeaders=connection;host;transfer-encoding;user-agent;x-amz-date;x-amzn-fragment-acknowledgment-required;x-amzn-fragment-timecode-type;x-amzn-producer-start-timestamp;x-amzn-stream-name, Signature=4c548d1966bfc9d90980b92b71409750a85cd756ed23271
2022-07-04 19:40:02 [140473750517312] DEBUG - putStreamCurlHandler(): Network thread for Kinesis Video stream: MYSTREAM_NAME with upload handle: 0 exited. http status: 0
2022-07-04 19:40:02 [140473750517312] WARN - putStreamCurlHandler(): Stream with streamHandle 94086962655504 uploadHandle 0 has exited without triggering end-of-stream. Service call result: 599
2022-07-04 19:40:02 [140473750517312] INFO - kinesisVideoStreamTerminated(): Stream 0x559253fcb110 terminated upload handle 0 with service call result 599.
2022-07-04 19:40:02 [140473750517312] DEBUG - defaultStreamStateTransitionHook(): Stream state machine retry count: 0
2022-07-04 19:40:02 [140473750517312] DEBUG - defaultStreamStateTransitionHook():
KinesisVideoStream base result is [599]. Executing KVS retry handler of retry strategy type [1]
DEBUG - Dropping frame with flag: 15360:02:39.4 / 99:99:99.
2022-07-04 19:40:02 [140473742124608] DEBUG - defaultStreamStateTransitionHook(): Stream state machine retry count: 1
2022-07-04 19:40:02 [140473742124608] DEBUG - defaultStreamStateTransitionHook():
KinesisVideoStream base result is [599]. Executing KVS retry handler of retry strategy type [1]
2022-07-04 19:40:02 [140473742124608] DEBUG - streamReadyHandler invoked
2022-07-04 19:40:02 [140473742124608] DEBUG - defaultStreamStateTransitionHook(): Stream state machine retry count: 2
2022-07-04 19:40:02 [140473742124608] DEBUG - defaultStreamStateTransitionHook():
KinesisVideoStream base result is [599]. Executing KVS retry handler of retry strategy type [1]
2022-07-04 19:40:02 [140473742124608] INFO - putStreamResultEvent(): Put stream result event. New upload handle 1
DEBUG - Dropping frame with flag: 15360:02:39.6 / 99:99:99.
INFO - writeHeaderCallback(): RequestId: dc2f1583-c74c-3c15-8712-51d03e572d8f
DEBUG - postReadCallback(): Pausing CURL read for upload handle: 1
DEBUG - Dropping frame with flag: 15360:02:41.4 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:41.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:42.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:43.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:44.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:45.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:46.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:47.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:48.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:49.6 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:50.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:51.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:52.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:53.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:54.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:55.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:56.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:57.5 / 99:99:99.
DEBUG - Dropping frame with flag: 15360:02:58.5 / 99:99:99.
After receiving this output I stopped it with CTRL+C.
Does anyone know what can be done to solve this problem?
Thanks in advance.
Solved by escaping the & in the RTSP path /cam/realmonitor?channel=1&subtype=0: on Linux, & is a special shell character, so the unquoted URL gets cut off at the ampersand unless you quote the URL or escape the &.
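For example, quoting the location in the pipeline from the question keeps the shell from interpreting the & (HOST and the key/stream placeholders are as before):

gst-launch-1.0 -q rtspsrc location="rtsp://HOST/cam/realmonitor?channel=1&subtype=0" short-header=TRUE ! rtph264depay ! h264parse ! kvssink stream-name=MYSTREAM_NAME storage-size=128 access-key=MYACCESS_KEY secret-key=MYSECRET_KEY aws-region=us-east-1 retention-period=168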
How do I verify that my bookmarks are working? I find that when I run a job immediately after the previous one finishes, it still seems to take a long time. Why is that? I thought it would not re-read the files it has already processed? The script looks like below:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
## #params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
inputGDF = glueContext.create_dynamic_frame_from_options(connection_type = "s3", connection_options = {"paths": ["s3://xxx-glue/testing-csv"], "recurse": True}, format = "csv", format_options = {"withHeader": True}, transformation_ctx="inputGDF")
if bool(inputGDF.toDF().head(1)):
    print("Writing ...")
    inputGDF.toDF() \
        .drop("createdat") \
        .drop("updatedat") \
        .write \
        .mode("append") \
        .partitionBy(["querydestinationplace", "querydatetime"]) \
        .parquet("s3://xxx-glue/testing-parquet")
else:
    print("Nothing to write ...")
job.commit()
import boto3
glue_client = boto3.client('glue', region_name='ap-southeast-1')
glue_client.start_crawler(Name='xxx-testing-partitioned')
The log looks like:
18/12/11 14:49:03 INFO Client: Application report for application_1544537674695_0001 (state: RUNNING)
18/12/11 14:49:03 DEBUG Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.31.2.72
ApplicationMaster RPC port: 0
queue: default
start time: 1544539297014
final status: UNDEFINED
tracking URL: http://ip-172-31-0-204.ap-southeast-1.compute.internal:20888/proxy/application_1544537674695_0001/
user: root
18/12/11 14:49:04 INFO Client: Application report for application_1544537674695_0001 (state: RUNNING)
18/12/11 14:49:04 DEBUG Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.31.2.72
ApplicationMaster RPC port: 0
queue: default
start time: 1544539297014
final status: UNDEFINED
tracking URL: http://ip-172-31-0-204.ap-southeast-1.compute.internal:20888/proxy/application_1544537674695_0001/
user: root
18/12/11 14:49:05 INFO Client: Application report for application_1544537674695_0001 (state: RUNNING)
18/12/11 14:49:05 DEBUG Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 172.31.2.72
ApplicationMaster RPC port: 0
queue: default
start time: 1544539297014
final status: UNDEFINED
tracking URL: http://ip-172-31-0-204.ap-southeast-1.compute.internal:20888/proxy/application_1544537674695_0001/
user: root
...
18/12/11 14:42:00 INFO NewHadoopRDD: Input split: s3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-15_2018-11-19.csv:0+1194081
18/12/11 14:42:00 INFO S3NativeFileSystem: Opening 's3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-14_2018-11-18.csv' for reading
18/12/11 14:42:00 INFO S3NativeFileSystem: Opening 's3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-15_2018-11-19.csv' for reading
18/12/11 14:42:00 INFO Executor: Finished task 89.0 in stage 0.0 (TID 89). 2088 bytes result sent to driver
18/12/11 14:42:00 INFO CoarseGrainedExecutorBackend: Got assigned task 92
18/12/11 14:42:00 INFO Executor: Running task 92.0 in stage 0.0 (TID 92)
18/12/11 14:42:00 INFO NewHadoopRDD: Input split: s3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-16_2018-11-20.csv:0+1137753
18/12/11 14:42:00 INFO Executor: Finished task 88.0 in stage 0.0 (TID 88). 2088 bytes result sent to driver
18/12/11 14:42:00 INFO CoarseGrainedExecutorBackend: Got assigned task 93
18/12/11 14:42:00 INFO Executor: Running task 93.0 in stage 0.0 (TID 93)
18/12/11 14:42:00 INFO NewHadoopRDD: Input split: s3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-17_2018-11-21.csv:0+1346626
18/12/11 14:42:00 INFO S3NativeFileSystem: Opening 's3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-16_2018-11-20.csv' for reading
18/12/11 14:42:00 INFO S3NativeFileSystem: Opening 's3://pinfare-glue/testing-csv/2018-09-25/DPS/2018-11-17_2018-11-21.csv' for reading
18/12/11 14:42:00 INFO Executor: Finished task 90.0 in stage 0.0 (TID 90). 2088 bytes result sent to driver
18/12/11 14:42:00 INFO Executor: Finished task 91.0 in stage 0.0 (TID 91). 2088 bytes result sent to driver
18/12/11 14:42:00 INFO CoarseGrainedExecutorBackend: Got assigned task 94
18/12/11 14:42:00 INFO CoarseGrainedExecutorBackend: Got assigned task 95
18/12/11 14:42:00 INFO Executor: Running task 95.0 in stage 0.0 (TID 95)
18/12/11 14:42:00 INFO Executor: Running task 94.0 in stage 0.0 (TID 94)
... I notice the parquet is appended with a lot of duplicate data ... Is the bookmark not working? It's already enabled.
Bookmarking Requirements
From the docs
Job must be created with --job-bookmark-option job-bookmark-enable (or if using the console then in the console options). Job must also have a jobname; this will be passed in automatically.
Job must start with a job.init(jobname)
e.g.
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
Job must have a job.commit() to save the state of the bookmark and finish successfully.
The datasource must be either an S3 source or JDBC (limited, and not your use case, so I will ignore it).
The example in the docs shows creating a dynamicframe from the (Glue/Lake formation) catalog using the tablename, not an explicit S3 path. This implies that reading from the catalog is still considered an S3 source; the underlying files will be on S3.
Files on s3 must be one of JSON, CSV, Apache Avro, XML for version 0.9 and above, or can be Parquet or ORC for version 1.0 and above
The datasource in the script must have a transformation_ctx parameter (there is a small sketch after this list).
The docs say
pass the transformation_ctx parameter only to those methods that you
want to enable bookmarks
You could add this to every transform for saving state but the critical one(s) are the datasource(s) you want to bookmark.
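As a rough sketch only (database, table and context names are placeholders, and glueContext is set up the same way as in the script in the question), a bookmarked catalog read looks like this; bookmarks must also be enabled on the job itself:

from pyspark.context import SparkContext
from awsglue.context import GlueContext

glueContext = GlueContext(SparkContext.getOrCreate())

# transformation_ctx is the key bookmarks use to track state for this source
datasource0 = glueContext.create_dynamic_frame.from_catalog(
    database="my_db",
    table_name="my_table",
    transformation_ctx="datasource0"
)

# Enable bookmarks when starting the job, e.g. from the CLI:
#   aws glue start-job-run --job-name my-job --arguments '--job-bookmark-option=job-bookmark-enable'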
Troubleshooting
From the docs
Max concurrency must be 1. Higher values break bookmarking
It also mentions job.commit() and using the transformation_ctx as above
For Amazon S3 input sources, job bookmarks check the last modified
time of the objects, rather than the file names, to verify which
objects need to be reprocessed. If your input source data has been
modified since your last job run, the files are reprocessed when you
run the job again.
Other things to check
Have you verified that your CSV files in the path "s3://xxx-glue/testing-csv" do not already contain duplicates? You could use a Glue crawler, or write DDL in Athena to create a table over them and look directly. Alternatively, create a dev endpoint and run a Zeppelin or SageMaker notebook and step through your code.
The docs don't mention anywhere that editing your script would reset your state. However, if you modify the transformation_ctx of the datasource or other stages, that would likely impact the state, though I haven't verified it. The job has a job name that keys the state, along with the run number, attempt number and version number that are used to manage retries and the latest state. This implies that minor changes to the script shouldn't affect the state as long as the job name is consistent, but again I haven't verified that.
As an aside, in your code you test inputGDF.toDF().head(1) and then call inputGDF.toDF() again to write the data. Spark is lazily evaluated, so in that case you run the DynamicFrame-to-DataFrame conversion twice and Spark can't cache or reuse it. Better to do df = inputGDF.toDF() once before the if and reuse df.
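A sketch of that restructuring, based on the script in the question:

df = inputGDF.toDF()  # convert once, reuse below
if bool(df.head(1)):
    print("Writing ...")
    df.drop("createdat", "updatedat") \
      .write \
      .mode("append") \
      .partitionBy("querydestinationplace", "querydatetime") \
      .parquet("s3://xxx-glue/testing-parquet")
else:
    print("Nothing to write ...")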
Please check this doc about the AWS Glue bookmarking mechanism.
Basically, it requires you to enable bookmarks via the console (or CloudFormation) and to specify the transformation_ctx parameter, which is used together with some other attributes (like the job name and source file names) to save checkpointing information. If you change the value of one of these attributes, Glue will treat it as a different checkpoint.
https://docs.aws.amazon.com/glue/latest/dg/monitor-debug-multiple.html can be used to verify whether the bookmark is working or not.
Bookmarks are not supported for the Parquet format in Glue version 0.9.
They are supported in Glue version 1.0, though.
Just for the record, since there are no answers yet:
I think editing the script seems to affect the bookmarks ... but I thought it should not ...
This is my first BOSH installation for PKS.
Environment:
vSphere 6.5 with VCSA 6.5u2,
OpsMgr 2.2 build 296
bosh stemcell vsphere-ubuntu-trusty build 3586.25
Using a flat 100.x network, no routing/firewall involved.
Summary - After deploying the OpsMgr OVF template, I'm configuring and installing BOSH Director.
However, it fails at "Waiting for Agent" in the dashboard.
A look at the 'current' log in the OpsMgr VM shows that it keeps trying to read settings from /dev/sr0, because the agent.json specifies settings Source as CDROM.
It cannot find any CDROM, so it fails.
A few questions:
How do I log in to the VM that BOSH creates when I change the setting to "default BOSH password" for all VMs in Ops Mgr?
There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings it's applying. Is the location wrong?
Is there a way to change the stemcell used by the OpsMgr VM? Maybe I can try using the previous build?
How is the agent.json actually populated?
Any suggestions on troubleshooting this?
All logs/jsons below:
the GUI dashboard log:
===== 2018-07-30 08:20:52 UTC Running "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"
Deployment manifest: '/var/tempest/workspaces/default/deployments/bosh.yml'
Deployment state: '/var/tempest/workspaces/default/deployments/bosh-state.json'
Started validating
Validating release 'bosh'... Finished (00:00:00)
Validating release 'bosh-vsphere-cpi'... Finished (00:00:00)
Validating release 'uaa'... Finished (00:00:00)
Validating release 'credhub'... Finished (00:00:01)
Validating release 'bosh-system-metrics-server'... Finished (00:00:01)
Validating release 'os-conf'... Finished (00:00:00)
Validating release 'backup-and-restore-sdk'... Finished (00:00:04)
Validating release 'bpm'... Finished (00:00:02)
Validating cpi release... Finished (00:00:00)
Validating deployment manifest... Finished (00:00:00)
Validating stemcell... Finished (00:00:14)
Finished validating (00:00:26)
Started installing CPI
Compiling package 'ruby-2.4-r4/0cdc60ed7fdb326e605479e9275346200af30a25'... Finished (00:00:00)
Compiling package 'vsphere_cpi/e1a84e5bd82eb1abfe9088a2d547e2cecf6cf315'... Finished (00:00:00)
Compiling package 'iso9660wrap/82cd03afdce1985db8c9d7dba5e5200bcc6b5aa8'... Finished (00:00:00)
Installing packages... Finished (00:00:15)
Rendering job templates... Finished (00:00:06)
Installing job 'vsphere_cpi'... Finished (00:00:00)
Finished installing CPI (00:00:23)
Starting registry... Finished (00:00:00)
Uploading stemcell 'bosh-vsphere-esxi-ubuntu-trusty-go_agent/3586.25'... Skipped [Stemcell already uploaded] (00:00:00)
Started deploying
Waiting for the agent on VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Failed (00:00:11)
Deleting VM 'vm-87b3299a-a994-4544-8043-032ce89d685b'... Finished (00:00:10)
Creating VM for instance 'bosh/0' from stemcell 'sc-536fea79-cfa6-46a9-a53e-9de19505216f'... Finished (00:00:12)
Waiting for the agent on VM 'vm-fb90eee8-f3ac-45b7-95d3-4e8483c91a5c' to be ready... Failed (00:09:59)
Failed deploying (00:10:38)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)
Deploying:
Creating instance 'bosh/0':
Waiting until instance is ready:
Post https://vcap:<redacted>@192.168.100.201:6868/agent: dial tcp 192.168.100.201:6868: connect: no route to host
Exit code 1
===== 2018-07-30 08:32:20 UTC Finished "/usr/local/bin/bosh --no-color --non-interactive --tty create-env /var/tempest/workspaces/default/deployments/bosh.yml"; Duration: 688s; Exit Status: 1
Exited with 1.
The bosh-state.json:
ubuntu@opsmanager-2-2:~$ sudo cat /var/tempest/workspaces/default/deployments/bosh-state.json
{
"director_id": "851f70ef-7c4b-4c65-73ed-d382ad3df1b7",
"installation_id": "f29df8af-7141-4aff-5e52-2d109a84cd84",
"current_vm_cid": "vm-87b3299a-a994-4544-8043-032ce89d685b",
"current_stemcell_id": "dcca340c-d612-4098-7c90-479193fa9090",
"current_disk_id": "",
"current_release_ids": [],
"current_manifest_sha": "",
"disks": null,
"stemcells": [
{
"id": "dcca340c-d612-4098-7c90-479193fa9090",
"name": "bosh-vsphere-esxi-ubuntu-trusty-go_agent",
"version": "3586.25",
"cid": "sc-536fea79-cfa6-46a9-a53e-9de19505216f"
}
],
"releases": []
The agent.json:
ubuntu@opsmanager-2-2:~$ sudo cat /var/vcap/bosh/agent.json
{
"Platform": {
"Linux": {
"DevicePathResolutionType": "scsi"
}
},
"Infrastructure": {
"Settings": {
"Sources": [
{
"Type": "CDROM",
"FileName": "env"
}
]
}
}
}
ubuntu@opsmanager-2-2:~$
Finally, the current BOSH log
/var/vcap/bosh/log/current
2018-07-30_08:42:22.69934 [main] 2018/07/30 08:42:22 DEBUG - Starting agent
2018-07-30_08:42:22.69936 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/agent.json
2018-07-30_08:42:22.69937 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69937 ********************
2018-07-30_08:42:22.69938 {
2018-07-30_08:42:22.69938 "Platform": {
2018-07-30_08:42:22.69939 "Linux": {
2018-07-30_08:42:22.69939
2018-07-30_08:42:22.69939 "DevicePathResolutionType": "scsi"
2018-07-30_08:42:22.69939 }
2018-07-30_08:42:22.69939 },
2018-07-30_08:42:22.69939 "Infrastructure": {
2018-07-30_08:42:22.69940 "Settings": {
2018-07-30_08:42:22.69940 "Sources": [
2018-07-30_08:42:22.69940 {
2018-07-30_08:42:22.69940 "Type": "CDROM",
2018-07-30_08:42:22.69940 "FileName": "env"
2018-07-30_08:42:22.69940 }
2018-07-30_08:42:22.69941 ]
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941 }
2018-07-30_08:42:22.69941
2018-07-30_08:42:22.69941 ********************
2018-07-30_08:42:22.69943 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_version
2018-07-30_08:42:22.69944 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69944 3586.25
2018-07-30_08:42:22.69944 ********************
2018-07-30_08:42:22.69945 [File System] 2018/07/30 08:42:22 DEBUG - Reading file /var/vcap/bosh/etc/stemcell_git_sha1
2018-07-30_08:42:22.69946 [File System] 2018/07/30 08:42:22 DEBUG - Read content
2018-07-30_08:42:22.69946 ********************
2018-07-30_08:42:22.69946 dbbb73800373356315a4c16ee40d2db3189bf2db
2018-07-30_08:42:22.69947 ********************
2018-07-30_08:42:22.69948 [App] 2018/07/30 08:42:22 INFO - Running on stemcell version '3586.25' (git: dbbb73800373356315a4c16ee40d2db3189bf2db)
2018-07-30_08:42:22.69949 [File System] 2018/07/30 08:42:22 DEBUG - Checking if file exists /var/vcap/bosh/agent_state.json
2018-07-30_08:42:22.69950 [File System] 2018/07/30 08:42:22 DEBUG - Stat '/var/vcap/bosh/agent_state.json'
2018-07-30_08:42:22.69951 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Running command 'bosh-agent-rc'
2018-07-30_08:42:22.70116 [unlimitedRetryStrategy] 2018/07/30 08:42:22 DEBUG - Making attempt #0
2018-07-30_08:42:22.70117 [DelayedAuditLogger] 2018/07/30 08:42:22 DEBUG - Starting logging to syslog...
2018-07-30_08:42:22.70181 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stdout:
2018-07-30_08:42:22.70182 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Stderr:
2018-07-30_08:42:22.70183 [Cmd Runner] 2018/07/30 08:42:22 DEBUG - Successful: true (0)
2018-07-30_08:42:22.70184 [settingsService] 2018/07/30 08:42:22 DEBUG - Loading settings from fetcher
2018-07-30_08:42:22.70185 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - Kicking device, attempt 0 of 5
2018-07-30_08:42:22.70187 [ConcreteUdevDevice] 2018/07/30 08:42:22 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.20204 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 1 of 5
2018-07-30_08:42:23.20206 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:23.70217 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - Kicking device, attempt 2 of 5
2018-07-30_08:42:23.70220 [ConcreteUdevDevice] 2018/07/30 08:42:23 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.20229 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 3 of 5
2018-07-30_08:42:24.20294 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:24.70249 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - Kicking device, attempt 4 of 5
2018-07-30_08:42:24.70253 [ConcreteUdevDevice] 2018/07/30 08:42:24 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20317 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20320 [ConcreteUdevDevice] 2018/07/30 08:42:25 ERROR - Failed to red byte from device: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.20321 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Settling UdevDevice
2018-07-30_08:42:25.20322 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Running command 'udevadm settle'
2018-07-30_08:42:25.20458 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stdout:
2018-07-30_08:42:25.20460 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Stderr:
2018-07-30_08:42:25.20461 [Cmd Runner] 2018/07/30 08:42:25 DEBUG - Successful: true (0)
2018-07-30_08:42:25.20462 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 0 out of 5
2018-07-30_08:42:25.20463 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.20464 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:25.70473 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ensuring Device Readable, Attempt 1 out of 5
2018-07-30_08:42:25.70476 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:25.70477 [ConcreteUdevDevice] 2018/07/30 08:42:25 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.20492 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 2 out of 5
2018-07-30_08:42:26.20496 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.20497 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:26.70509 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ensuring Device Readable, Attempt 3 out of 5
2018-07-30_08:42:26.70512 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:26.70513 [ConcreteUdevDevice] 2018/07/30 08:42:26 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.20530 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ensuring Device Readable, Attempt 4 out of 5
2018-07-30_08:42:27.20533 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.20534 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - Ignorable error from readByte: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70554 [ConcreteUdevDevice] 2018/07/30 08:42:27 DEBUG - readBytes from file: /dev/sr0
2018-07-30_08:42:27.70557 [settingsService] 2018/07/30 08:42:27 ERROR - Failed loading settings via fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70559 [settingsService] 2018/07/30 08:42:27 ERROR - Failed reading settings from file Opening file /var/vcap/bosh/settings.json: open /var/vcap/bosh/settings.json: no such file or directory
2018-07-30_08:42:27.70560 [main] 2018/07/30 08:42:27 ERROR - App setup Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.70561 [main] 2018/07/30 08:42:27 ERROR - Agent exited with error: Running bootstrap: Fetching settings: Invoking settings fetcher: Getting settings from all sources: Reading files from CDROM: Waiting for CDROM to be ready: Reading udev device: open /dev/sr0: no such file or directory
2018-07-30_08:42:27.71258 [main] 2018/07/30 08:42:27 DEBUG - Starting agent
<and this whole block just keeps repeating>
How do I log in to the VM that BOSH creates when I change the setting to "default BOSH password" for all VMs in Ops Mgr?
That's not a good idea. The default password is well-known and you should almost always use randomly generated passwords. I'm not honestly sure why that's even an option. The only thing that comes to mind might be some extremely rare troubleshooting scenario.
That said, you can securely obtain the randomly generated password through Ops Manager, if you need to access the VM manually. You can also securely access VMs via bosh ssh, and credentials are handled automatically. Even for troubleshooting, you don't usually need that option.
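For example, with the BOSH CLI the pattern is roughly the following for VMs deployed by a working Director (environment alias, deployment and instance names are placeholders):

bosh -e my-env -d my-deployment ssh web/0  # SSH into instance 0 of the "web" instance group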
There is no bosh.yml under /var/tempest/workspaces/default/deployments. Some docs point to it, so I don't know what settings it's applying. Is the location wrong?
The location is correct but the file contains sensitive information so Ops Manager deletes it immediately after it's done being used.
If you want to see the contents of the file, the easy way is to navigate to https://ops-man-fqdn/debug/files and you can see all of the configuration files, including your bosh.yml. The hard way is to watch the folder above while a deploy is going on and you'll see the file exist for a short period of time. You can make a copy during that window. The only advantage to the hard way is that you'll get the actual file, whereas the debug endpoint shows a file with sensitive info redacted.
Is there a way to change the stemcell used by the OpsMgr VM? Maybe I can try using the previous build?
I don't think this is an issue with the stemcell. There are lots of people using those and not having this issue. If a larger issue like this were found with a stemcell, you would see a notice up on Pivotal Network and Pivotal would publish a new, fixed stemcell.
The problem also seems to be with how the VM is receiving its initial bootstrap configuration. I'd suggest looking into that more before messing with the stemcells. See below.
How is the agent.json actually populated?
Believe it or not, for vSphere environments, that file is read from a fake CD-ROM that's attached to the VM. There's not a lot documented, but it's mentioned briefly in the BOSH docs here.
https://bosh.io/docs/cpi-api-v1-method/create-vm/#agent-settings
Any suggestions on troubleshooting this?
Look to understand why the CD-ROM can't be mounted. BOSH needs that to get its bootstrap configuration, so you need to make that work. If there is something in your vSphere environment that is preventing the CD-ROM from being mounted, you'll need to modify it to allow the CD-ROM to be mounted.
If there's nothing on the vSphere side, I think the next step would be to check the standard system logs under /var/log and dmesg output to see if there are any errors or clues as to why the CD-ROM can't be loaded/read from.
Lastly, try doing some manual tests to mount & read from the CD-ROM. Start by looking at one of the BOSH deployed VMs in the vSphere client, look at the hardware settings and make sure there is a CD-ROM attached. It should point to a file called env.iso in the same folder as the VM on your datastore. If that's attached & connected, start up the VM and try to mount the CD-ROM. You should be able to see the BOSH config files on that drive.
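A rough manual check from inside the VM could look like this (device and mount point names are the usual defaults; adjust as needed):

dmesg | grep -i -E 'sr0|cdrom'        # any kernel errors about the virtual CD-ROM?
ls -l /dev/sr0 /dev/cdrom             # does the device node exist at all?
sudo mkdir -p /mnt/cdrom
sudo mount -o ro /dev/sr0 /mnt/cdrom  # mount the env ISO read-only
ls /mnt/cdrom                         # should show the agent's "env" settings file
sudo umount /mnt/cdrom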
Hope that helps!
Old thread, but maybe it will help someone: there's a firewall in vCenter that can prevent the agent from talking to the BOSH Director.
I am trying to use AWS CodeDeploy and am running the aws deploy push --debug command. The file to be uploaded is around 250 KB, but the upload doesn't finish. The following logs are displayed:
2017-10-27 11:11:40,601 - MainThread - botocore.auth - DEBUG - CanonicalRequest:
PUT
/frontend-deployer/business-services-0.0.1-SNAPSHOT-classes.jar
partNumber=39&uploadId=.olvaJkxreDZf1ObaHCMtHmkQ5DFE.uZ9Om0sxZB08YG3tqRWBxmGLTFWSYQaj9mHl26LPJk..Stv_vPB5NMaV.zAqsYX6fZz_S3.uN5J4FlxHZFXoeTkMiBSYQB2C.g
content-md5:EDXgvJ8Tt5tHYZ6Nkh7epg==
host:s3.us-east-2.amazonaws.com
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20171027T081140Z
content-md5;host;x-amz-content-sha256;x-amz-date
UNSIGNED-PAYLOAD
...
2017-10-27 11:12:12,035 - MainThread - botocore.endpoint - DEBUG - Sending http request: <PreparedRequest [PUT]>
2017-10-27 11:12:12,035 - MainThread - botocore.awsrequest - DEBUG - Waiting for 100 Continue response.
2017-10-27 11:12:12,189 - MainThread - botocore.awsrequest - DEBUG - 100 Continue response seen, now sending request body.
Even though the file is fairly small (250 KB), the upload doesn't finish.
On the other hand, an upload via the aws s3 cp command takes about 1 second.
How can I increase the upload speed of the aws deploy push command?
My Elastic Beanstalk environment stops streaming Node.js events to CloudWatch Logs. Streaming works fine for a few minutes on a new instance; after a few minutes, no more logs show up in CloudWatch.
I set up AWS Elastic Beanstalk to stream logs to CloudWatch under Configuration > Software Configuration > CloudWatch Logs > Log Streaming (true). I deactivated log streaming and reactivated it as a test. Taking a look at CloudWatch:
The last eb-activity log is about 10 minutes old
The error log is not available (on either of the instances)
nginx/access.log is a few seconds old
nodejs.log is about an hour old (shortly after relaunching the instance)
Every health check writes a log entry into nodejs.log every few seconds, though.
I did not find any logs on the EC2 instance regarding log streaming.
Has anyone had similar issues?
How do I make Elastic Beanstalk stream Node.js logs to CloudWatch Logs?
--- EDIT
[ec2-user@ip-###-##-##-## log]$ cat /var/log/awslogs.log
2017-03-07 11:01:05,928 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Detected file rotation, notifying reader
2017-03-07 11:01:05,928 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Reader is still alive.
2017-03-07 11:01:05,928 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:05,928 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:06,052 - cwlogs.push.reader - INFO - 31861 - Thread-8 - No data is left. Reader is leaving.
2017-03-07 11:01:10,929 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Removing dead reader [2177a5cce5ed29525de329bfdc292ff1, /var/log/nginx/access.log]
2017-03-07 11:01:10,929 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Starting reader for [92257964a10edeb586f084f4f2ba35de, /var/log/nginx/access.log]
2017-03-07 11:01:10,930 - cwlogs.push.reader - INFO - 31861 - Thread-11 - Start reading file from 0.
2017-03-07 11:01:10,930 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:10,930 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:15,931 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:15,931 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:16,788 - cwlogs.push.publisher - INFO - 31861 - Thread-7 - Log group: /aws/elasticbeanstalk/production/var/log/nginx/access.log, log stream: i-0bd24767864801e2c, queue size: 0, Publish batch: {'skipped_events_count': 0, 'first_event': {'timestamp': 1488884470930, 'start_position': 0L, 'end_position': 114L}, 'fallback_events_count': 0, 'last_event': {'timestamp': 1488884472931, 'start_position': 341L, 'end_position': 454L}, 'source_id': '92257964a10edeb586f084f4f2ba35de', 'num_of_events': 4, 'batch_size_in_bytes': 554}
2017-03-07 11:01:20,932 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:20,932 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:25,933 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:25,933 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:27,881 - cwlogs.push.publisher - INFO - 31861 - Thread-7 - Log group: /aws/elasticbeanstalk/production/var/log/nginx/access.log, log stream: i-0bd24767864801e2c, queue size: 0, Publish batch: {'skipped_events_count': 0, 'first_event': {'timestamp': 1488884481933, 'start_position': 454L, 'end_position': 568L}, 'fallback_events_count': 0, 'last_event': {'timestamp': 1488884482934, 'start_position': 568L, 'end_position': 681L}, 'source_id': '92257964a10edeb586f084f4f2ba35de', 'num_of_events': 2, 'batch_size_in_bytes': 277}
When Andrew (@andrew-ferk) and I activated log streaming, it created all the log groups and streams in CloudWatch with the current log. After we deployed again, we noticed the logs stopped. This is because AWS hashes the first line of the log; if it has seen that hash before, it treats that file as if it has already been processed.
If you are using npm start, the first lines will be your application's name and version.
You can add CMD date && npm start to your Dockerfile to produce a different first line each time, or run npm in silent mode (as long as your first output is unique).
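A minimal sketch of that Dockerfile change (base image and app layout are placeholders; the relevant line is the CMD):

FROM node:8
WORKDIR /app
COPY . .
RUN npm install
# Print a timestamp first so the first log line differs on every start
CMD date && npm start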
Also, according to their docs, you should add a policy like the following to your Elastic Beanstalk setup before enabling the feature (AWS-Docs):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:GetLogEvents",
"logs:PutLogEvents",
"logs:DescribeLogGroups",
"logs:DescribeLogStreams",
"logs:PutRetentionPolicy"
],
"Resource": [
"*"
]
}
]
}
The following FAQs might be helpful:
CloudWatch Logs Agent FAQs
Why can’t I push log data to CloudWatch Logs with the awslogs agent?
Some things to check if you are streaming custom log files:
eb ssh into the instance and look at /var/log/awslogs.log. If that doesn't even mention "Loading additional configs from (your awslogs config file)", make sure you are installing your config file correctly, as well as restarting the awslogs service after installing it (presumably using .ebextensions). See "Custom Log File Streaming" in Using Elastic Beanstalk with Amazon CloudWatch Logs, and see the commands section in logs-streamtocloudwatch-linux.config for how to restart the awslogs service.
The CloudWatch Logs Agent is stateful. If the first few lines of your log file are blank or never change, you may need to set file_fingerprint_lines. See CloudWatch Logs Agent Reference.
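If you do need it, a sketch of an awslogs agent stanza could look like the following (the log group, stream and file path are placeholders modelled on the layout in the question; see the agent reference for the exact options):

[/var/log/nodejs/nodejs.log]
file = /var/log/nodejs/nodejs.log
log_group_name = /aws/elasticbeanstalk/production/var/log/nodejs/nodejs.log
log_stream_name = {instance_id}
datetime_format = %Y-%m-%d %H:%M:%S
# fingerprint on lines 2-5 instead of only the first line, which may be
# identical across restarts (e.g. an "npm start" banner)
file_fingerprint_lines = 2-5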