Our OR instance seems to be broken.
We have S3 external storage enabled. However, for the past week, no data has been sent to S3. This is evidenced by:
• All sessions from that time frame show the "This session is still being processed. Please check it again in a few minutes" error.
• S3 bucket size has not changed during that time.
Before that breaking point, there was a lot of data cleanup activity on the OR server, mainly deleting files from the /openreplay/storage/nfs directory, because files (canvas images) were building up fast and the efs cleanup was not able to keep up. Attempts to upgrade the OR server were also made during this time, with zero success. I strongly suspect the combination of these activities resulted in the current bad state.
Current state -
• OR server endpoints continue to work. From the browser's perspective, the web/i and web/images calls all seem fine.
• The sessions are recorded in internal postgres and show up in the OR search UI.
• Calls to external storage (S3) to fetch dom.mobs result in 404, so none of the sessions are replayable.
• /openreplay/storage/nfs is entirely empty. (Not sure why. It was overwhelmed with JPEG assets from Canvas recording.)
• I see a bunch of errors in sink:
/root/cmd/sink/main.go:65: writer error: can’t create session: 2871247767997554893, err: open /mnt/efs/2871247767997554893: permission denied
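
To narrow down the permission denied error, here is a minimal diagnostic sketch. It assumes the sink writes session files directly under /mnt/efs (as the log path suggests) and that Python is available wherever that mount is visible. The helper name describe_dir is hypothetical and not part of OpenReplay. It reports the directory's owner and mode, and tries to create a throwaway file the way a writer would.

```python
import os
import stat
import tempfile


def describe_dir(path):
    """Report owner, mode, and whether the current user can create files in path.

    Hypothetical helper for diagnosis only; not part of OpenReplay.
    """
    info = {"path": path, "exists": os.path.isdir(path)}
    if not info["exists"]:
        return info

    st = os.stat(path)
    info["uid"] = st.st_uid                      # numeric owner of the directory
    info["gid"] = st.st_gid                      # numeric group of the directory
    info["mode"] = stat.filemode(st.st_mode)     # e.g. "drwxr-xr-x"

    # Actually try to create (and auto-delete) a file, since that is
    # the operation that fails in the sink log.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        info["writable"] = True
    except OSError as exc:
        info["writable"] = False
        info["error"] = str(exc)
    return info
```

Calling print(describe_dir("/mnt/efs")) as the same user the sink process runs as should show whether the directory's ownership or mode changed during the cleanup. If "writable" comes back False, the uid, gid, and mode fields indicate what to compare against the sink's user.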