By default in Sitecore, when we import items into a bucket they are created in a hierarchy based on either the GUID or the datetime of creation.
Is there a way to create a flat structure in the bucket and have no hierarchy?
Would having a flat bucket structure affect the way Sitecore's bucket search works?
The whole idea of using item buckets is to avoid having a large number of items physically stored under the same parent item. However, you can still have a flat list of items if you want by setting BucketConfiguration.BucketFolderPath to blank or to a given name. This setting is located in /App_Config/Include/Sitecore.Buckets.config.
I have copied here the comments written above the BucketConfiguration.BucketFolderPath setting:
This setting determines the folder structure that is created in the
content tree. Edit this setting to change the folder structure.
The format currently supports date formatting, names, for example, "Content Bucket" or blank. Blank creates a dummy called
"Repository".
Related
I am new to Amazon AWS S3.
One of my applications processes 40,000 updates an hour, with a unique identifier for each update.
This identifier is basically a string.
At runtime, I want to store the ID in an S3 bucket for all updates.
But, as far as I understand, S3 stores data as files (objects).
Is there any way around this?
Should I store a file, then read that file each time, append the new ID, and store it again?
Any direction would be very helpful.
Thanks in advance.
I want it to be stored like:
Id1
Id2
Id3
...
Edit: Thanks for the responses; I have added the requested details.
I want to be able to just fetch all these IDs if and when a problem occurs in our system.
I am open to using something other than S3 as well. I was also looking into DynamoDB, with the ID as the primary key. But these IDs might be repetitive in 1-2% of cases.
In S3, you do not have the concept of files and folders. All you have is a bucket and objects inside the bucket. However, the AWS console groups objects with common prefixes so that they appear to be in the same folder.
Also, there is no such thing as appending to a file in S3. Since S3 objects are immutable, a so-called append actually deletes the previous object and creates a new object containing the previous object's data plus the appended data.
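To make the cost of that concrete, here is a minimal boto3 sketch of what an "append" boils down to (the bucket and key names are placeholders):

import boto3

s3 = boto3.client('s3')

# An S3 "append" means downloading the entire object, concatenating
# in memory, and re-uploading the whole thing as a new object.
def s3_append(bucket, key, new_line):
    body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    s3.put_object(Bucket=bucket, Key=key, Body=body + b'\n' + new_line)

s3_append('my-bucket', 'ids.txt', b'Id4')

At 40,000 updates an hour, that read-modify-write cycle quickly becomes the bottleneck.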
So, one way to do what I think you're trying to do is:
Suppose you have all the IDs written at 10:00 in an S3 object called data_corresponding_to_10_00_00. For the next hour (and the next 40,000 updates), if they are all new IDs, you can write them to another S3 object named data_corresponding_to_11_00_00.
However, if you do not want duplicate entries across the two files, and you need to update the previous file itself, S3 is not a great fit. Rather, use a database indexed on the ID so that lookups are faster.
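A minimal sketch of that hourly batching with boto3, assuming a hypothetical bucket name and that the IDs are buffered in memory until the flush:

import boto3
from datetime import datetime, timezone

s3 = boto3.client('s3')

def flush_hourly_ids(ids, bucket='my-updates-bucket'):  # hypothetical bucket
    # One object per hour, e.g. data_corresponding_to_11_00_00,
    # written in a single PUT instead of 40,000 appends.
    key = datetime.now(timezone.utc).strftime('data_corresponding_to_%H_00_00')
    s3.put_object(Bucket=bucket, Key=key, Body='\n'.join(ids).encode('utf-8'))

flush_hourly_ids(['Id1', 'Id2', 'Id3'])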
Is it possible to use AWS Athena to query S3 object tags? For example, if I have an S3 layout such as this
bucketName/typeFoo/object1.txt
bucketName/typeFoo/object2.txt
bucketName/typeFoo/object3.txt
bucketName/typeBar/object1.txt
bucketName/typeBar/object2.txt
bucketName/typeBar/object3.txt
And each object has an S3 Object Tag such as this
#For typeFoo/object1.txt and typeBar/object1.txt
id=A
#For typeFoo/object2.txt and typeBar/object2.txt
id=B
#For typeFoo/object3.txt and typeBar/object3.txt
id=C
Then is it possible to run an AWS Athena query to get any object with the associated tag such as this
select * from myAthenaTable where tag.id = 'A'
# returns typeFoo/object1.txt and typeBar/object1.txt
This is just an example and doesn't reflect my actual S3 bucket/object-prefix layout. Feel free to use any layout you wish in your answers/comments.
Ultimately I have a plethora of objects that could be in different buckets and folder paths but they are related to each other and my goal is to tag them so that I can query for a particular id value and get all objects related to that id. The id value would be a GUID and that GUID would map to many different types of objects that are related e.g., I could have a video file, a picture file, a meta-data file, and a json file and I want to get all of those files using their common id value; please feel free to offer suggestions too because I have the ability to structure this as I see fit.
Update - Note
S3 Object Metadata and S3 Object Tagging are two different things.
Athena does not support querying based on S3 object tags.
One workaround is to maintain a meta file that contains the tag-to-file mapping, using a Lambda function: whenever a new file arrives in S3, the Lambda updates a file in S3 with the tag and name details.
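A rough sketch of such a Lambda in Python, assuming it is subscribed to the bucket's object-created events (the index file name and the 'id' tag key are illustrative):

import json
import boto3

s3 = boto3.client('s3')
META_KEY = 'tag-index.json'  # hypothetical name for the mapping file

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        if key == META_KEY:
            continue  # don't index the index file itself
        tags = s3.get_object_tagging(Bucket=bucket, Key=key)['TagSet']
        # Load the existing index, starting empty on the first run
        try:
            obj = s3.get_object(Bucket=bucket, Key=META_KEY)
            index = json.loads(obj['Body'].read())
        except s3.exceptions.NoSuchKey:
            index = {}
        for tag in tags:
            if tag['Key'] == 'id':
                index.setdefault(tag['Value'], []).append(key)
        s3.put_object(Bucket=bucket, Key=META_KEY,
                      Body=json.dumps(index).encode('utf-8'))

Note that this read-modify-write is not safe under concurrent invocations; if new objects arrive frequently, a small database table keyed on the tag value is the sturdier choice.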
I have to store lots of photos (over 1,000,000, each up to 5 MB), and I have a database where every record has 5 photos. What is the best solution?
Create a directory for each record's slug/ID, and upload its photos inside it
Put all photos into one directory, and include the record's ID or slug in each photo name
Put all photos into one directory, and add a field with the photo names to each database record
I use Amazon S3.
I would suggest naming your photos like this when uploading in batches:
user1/image1.jpeg
user2/image2.jpeg
Though these names do not affect the way objects are stored on S3 (they are simply the 'keys' of the 'objects', since there is no folder-like hierarchical structure in S3), naming them this way makes the objects appear in folders, which helps to segregate images easily if you want to do so later.
For example, suppose you stored all images with unique names and you use a unique UUID to map records in the database to images in your bucket.
If you later want all 5 photos of a particular user, you will have to:
scan the database for the particular username
retrieve the UUIDs of that user's images
and then fetch the images from S3 using those UUIDs
But if you name images by prefixing the username to them, you can fetch the images directly from S3 without making any reference to your database.
For example, to list all photos of user1, you can use this small code snippet in Python:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket_name')  # replace with your bucket name

# List every object whose key starts with the user's prefix
for obj in bucket.objects.filter(Prefix='user1/'):
    print(obj.key)
Whereas if you don't use a user ID in the object key, you have to consult the database to map photos to records even just to get the list of images for a particular user.
A lot of this depends on your use-case, such as how the database and the photos will be used. There is not enough information here to give a definitive answer.
However, some recommendations for the storage side...
The easiest option is just to use a UUID for each photo. This is effectively a random name that has no meaning. Store that name in your database and your system will know which image relates to which record. There is no need to ever rename the images because the names are just Unique IDs and convey no further information.
When you want to provide access to a particular image, your application can generate an Amazon S3 pre-signed URL that grants time-limited access to an object. After the expiry time, the URL does not work so the object remains private. Granting access in this manner means that there is no need to group images into directories by "owner", since access is granted per-object rather than per-owner.
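As an illustration, generating such a pre-signed URL with boto3 might look like this (the bucket name, key, and expiry are placeholders):

import boto3

s3 = boto3.client('s3')

# Grant read access to one private photo for 15 minutes
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'photo-bucket',
            'Key': '9f3a2c84-1d2e-4b5f-8a6c-0e7d1f2a3b4c.jpg'},
    ExpiresIn=900,
)
print(url)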
Also, please note that Amazon S3 doesn't actually support folders. Rather, the Key ("filename") of the object is the entire path (e.g. user-2/foo.jpg). This makes it more human-readable (because the objects 'appear' to be in folders), but doesn't actually impact the way data is stored behind the scenes.
Bottom line: It doesn't really matter how you store the images. What matters is that you store the image name in your database so you know which image matches which record. Avoid situations where you need to rename images - just give them a name and keep it.
I'm using Sitecore 8.1 and I have a bucket that accepts a bucketable template. When I right-click the bucket and add an item based on that template, it gets added to the bucket using the default bucket hierarchy (as expected).
The issue is that the content editor doesn't display the added item. It in fact displays the parent folder item, i.e. /sitecore/content/x/Resources/Document Repository/2016/04/25/20/35
How can I change this so that the added item gets displayed and not its parent?
Thanks
If viewing buckets is enabled (View tab in the ribbon), the content editor usually loads the created (bucketable) item. Have you reviewed the log files for possible errors? Have you perhaps changed or patched the original Sitecore.Buckets.config file?
I'm using Sitecore with DMS (Sitecore 7.2), and I'm setting up various controls on my layouts to pull content from different folders based on the user's profile card. I'd like those folders to be "bucket" folders, since there'll be one folder for each profile card, and it'll be a bit unpleasant for authors to have to manually update all of these folders every time a new profile card is added.
The "Developers Guide to Item Buckets and Search" says:
by default, the items are organized according to the date and time of when the item was created, but this can be configured to use different behavior
Ideally I'd like to bucket my items on a field defined in a template that all of my "bucketable" item templates inherit from. I'll set that field to be a select dropdown from the list of profile cards.
I've found the Sitecore Autohaus demo with the Bucketing.GuidFolderPath class - looks like I need to define one of these classes with a GetFolderPath method? But then how do I tell my Sitecore bucket item that I want to bucket using that class?
You can indeed use a custom IDynamicBucketFolderPath and set it in the config (BucketConfiguration.DynamicBucketFolderPath), but that will change the default for all buckets.
Alternatively, you can define rules in Sitecore to specify the folder structure for a specific path/template/ID/etc.
By default there are 3 rules: CreateDateBasedPath, CreateIDBasedPath and CreateItemNameBasedPath, but you can of course add your own rules under /sitecore/system/Settings/Rules/Definitions/Elements/Bucketing/
You can change the bucketing strategy in two ways:
Using predefined bucketing rules. Navigate to the item bucket settings stored at /sitecore/system/Settings/Buckets and create a new rule (e.g. Bucketing Strategy: Item Creation Date) for resolving the bucket folder path.
Writing custom code for the bucketing strategy. Write a CustomBucketFolderPathResolver class that implements the IDynamicBucketFolderPath interface and returns the folder path.
Detailed information can be found in the posts below:
http://www.bugdebugzone.com/2014/07/configuring-sitecore-item-buckets-with.html
http://www.bugdebugzone.com/2014/07/configuring-sitecore-item-buckets-with_19.html