Querying a JPA repository by an element of a list/set field

I need to write a query in a JPA repository.
I have an entity mapped like this:
private String name;
private Set<String> roles;
There are two different role strings, "users" and "managers".
I need to list all users and all managers respectively.
I tried:
findAllByRolesContains(String role);
Then I tried:
return userRepository.findAllByRolesContains("users");
and received this response:
Parameter value [%users%] did not match expected type
How should I write it correctly?
I also tried
findAllByInRolesContains(String role);
but it did not work at all.

So first off, I see you didn't add the tags for "spring" and "spring data jpa" to your question. Keep in mind that you are showing code snippets for Spring Data JPA - a part of the Spring framework that builds upon the Java Persistence API (JPA). Make sure you understand the basics (JPA) first! Spring Data builds JPA queries under the hood and automates a lot of stuff, but it won't save you from knowing how to use JPA!
Next, if you have specific roles, i.e., not arbitrary strings ("role1324"), you might want to map an Enum instead.
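For illustration, such an enum mapping might look roughly like this (a sketch only; the enum name and values are just taken from the question, and @Enumerated(EnumType.STRING) stores the enum names as strings in the collection table):
public enum Role { USERS, MANAGERS }

// on the entity, instead of Set<String>:
@ElementCollection
@Enumerated(EnumType.STRING)
private Set<Role> roles;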
But let's look at your case:
@Entity
public class User {
    @Id
    private String name;
    @ElementCollection
    private Set<String> roles;
}
So you are trying to create a query method for your repository - Spring Data generates queries based on the name of the method. The rules can be found in the reference documentation.
So what does findAllByRolesContains generate? Anything between 'find' and 'By' is just descriptive, i.e., ignored. 'Roles' is the property and 'Contains' may mean "contained in a collection" or "contained in a string".
But we are missing a level here - you want to check whether a string inside a collection contains something, and I have not found a way to tell Spring Data to generate that query from the method name.
findAllByRolesLike is the closest one, but then you have to manually wrap the search string with '%'s.
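As a sketch of that fallback (assuming the repository and entity shown here), it would look something like:
// derived query - note that Spring Data does not add the wildcards for you here
List<User> findAllByRolesLike(String role);

// usage: wrap the search term manually
List<User> users = userRepository.findAllByRolesLike("%users%");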
Getting Spring Data to create the correct query can be fickle, and if it's not straightforward, it's a waste of time IMHO. You can always write the query yourself:
@Query("select user from User user join user.roles role where role like %:role%")
List<User> findAllByRolesContaining(@Param("role") String role);
But if you test this with a User with roles ["admin", "administrator"], you get the same user twice?! What gives?! Well, that's just how the underlying SQL works - it filters rows and there are two rows with "username - admin" and "username - administrator".
You want the distinct keyword to filter them out:
@Query("select distinct user from User user join user.roles role where role like %:role%")
List<User> findAllByRolesContaining(@Param("role") String role);
One more optimization: if roles is an @ElementCollection, it is fetched lazily by default. If you know that you will need to access it, you can add a join fetch user.roles before the join; this will load the collection eagerly. You can create two different methods if you like: you want to avoid fetching the roles if you don't need them, but if you know you'll need them, lazy loading may issue multiple queries.
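A sketch of that eagerly-fetching variant (assuming Hibernate as the JPA provider, which accepts the extra unaliased fetch join; the method name is made up):
@Query("select distinct user from User user join fetch user.roles join user.roles role where role like %:role%")
List<User> findAllByRolesContainingWithRoles(@Param("role") String role);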
The wikibook on JPA is a great resource and the book and blog of Vlad Mihalcea are the best resources on high performance JPA you can find.
Also - don't rely too much on the auto-magic of Spring Data JPA, IMHO it does more harm than good for anything but the most trivial of queries...

Related

DynamoDB 1 big table or multiple small tables?

I'm currently facing some questions regarding my database design. I'm developing an API which lets users do the following:
Create an Account (1 User owns 1 Account)
Create a Profile (1 Account owns 1-n Profiles)
Let a Profile upload 2 types of items (1 Profile owns 0-n Items; the items differ in type and purpose)
Calling the API methods triggers AWS Lambda to perform the requested operations in the DynamoDB tables.
My current plan looks like this:
It should be possible to query items by specifying a time frame and the Profile ID. But I think my design completely defeats the purpose of DynamoDB: the AWS documentation says that a well-designed product only requires one table.
What would be a good way to realise this architecture in one table?
Are there any drawbacks on using the current design?
What would you specify as Primary/Partition/sort key/secondary indexes in both the current design and a one-table-approach?
I’m going to give this answer assuming that you need to be able to do the following queries.
Given an Account, find all profiles
Given a Profile, find all Items
Given a Profile and a specific ItemType, find all Items
Given an Item, find the owning Profile
Given a Profile, find the owning account
One of the beauties of DynamoDB (and also a bane, perhaps) is that it is mostly schema-less. You need to have the mandatory Primary Key attributes for every item in the table, but all of the other attributes can be anything you like. In order to have a DynamoDB design with only one table, you usually need to get used to the idea of having mixed types of objects in the same table.
That being said, here’s a possible schema for your use case. My suggestion assumes that you are using something like UUIDs for your identifiers.
The partition key is a field that is simply called pkey (or whatever you want). We’ll also call the sort key skey (but again, it doesn’t really matter). Now, for an Account, the value of pkey is Account-{{uuid}} and the value of skey would be the same. For a Profile, the pkey value is also Account-{{uuid}}, but the skey value is Profile-{{uuid}}. Finally, for an Item, the pkey is Profile-{{uuid}} and the skey is Item-{{type}}-{{uuid}}. For all of the attributes of an item, don’t worry about it, just use whatever attributes you want to use.
Since the “parent” object is always the partition key, you can get any of the “child” objects simply by querying for the ID of the parent. For example, your key condition expression to get all the ‘ItemType2’s for a Profile would be
pkey = "Profile-{{uuid}}" AND begins_with(skey, "Item-Type2")
In this schema, your GSI has the same keys as the table, but reversed. You can query the GSI for ‘Item-{{type}}-{{uuid}}’ to get the owning Profile, and similarly for ‘Profile-{{uuid}}’ to get the owning Account.
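As an illustration only, such queries with the AWS SDK for Java v2 might look like the sketch below; the table name, GSI name and client setup are assumptions, not something given in the question:
import java.util.Map;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.QueryRequest;
import software.amazon.awssdk.services.dynamodb.model.QueryResponse;

public class AdjacencyListQueries {
    private final DynamoDbClient dynamo = DynamoDbClient.create();

    // All 'ItemType2' items belonging to a Profile (query against the table itself).
    QueryResponse itemsOfType2ForProfile(String profileUuid) {
        return dynamo.query(QueryRequest.builder()
                .tableName("AppData") // hypothetical table name
                .keyConditionExpression("pkey = :p AND begins_with(skey, :s)")
                .expressionAttributeValues(Map.of(
                        ":p", AttributeValue.builder().s("Profile-" + profileUuid).build(),
                        ":s", AttributeValue.builder().s("Item-Type2").build()))
                .build());
    }

    // Reverse lookup on the GSI (keys swapped): given an Item, find its owning Profile.
    QueryResponse owningProfileOfItem(String itemType, String itemUuid) {
        return dynamo.query(QueryRequest.builder()
                .tableName("AppData")
                .indexName("skey-pkey-index") // hypothetical GSI name
                .keyConditionExpression("skey = :s")
                .expressionAttributeValues(Map.of(
                        ":s", AttributeValue.builder().s("Item-" + itemType + "-" + itemUuid).build()))
                .build());
    }
}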
What I have illustrated here is the adjacency list pattern. DynamoDB also has an article describing how to use composite sort keys for hierarchical data, which would also be suitable for your data, and depending on your expected queries, it might be more suitable than using the adjacency list.
You don’t have to put everything in a single table. Yes, DynamoDB recommends it, but it is far more important to make sure that your application is correct and maintainable. If having multiple tables means it’s easier to write a defect free application, then use multiple tables.

Using EXISTS and GetProfileAttrAsList in a Siebel calculated field

In our Siebel 7.8 application, we have three entities: service requests (SR), groups and employees. Each employee can be a member of one or many groups, and each service request can be assigned to one or many groups too.
I have a requirement to create a calculated field on the service request BC which will indicate whether the current user belongs to any of the groups associated with the service request.
I have already created a multivalue field, called SR Groups, on the service request BC. I have also created another multivalue field, Employee Groups, on the Personalization Profile business component, which means that Siebel will automatically generate a multivalued profile attribute with the same name. All of the above is working as expected.
Next I've created this calculated field:
IIf(InList([SR Groups], GetProfileAttrAsList("Employee Groups")), "Y", "N")
It works, but it only checks if the SR's primary group is one of the current user's groups. I need to check all the SR groups, not only the primary one. So, I have created another calculated field:
IIf(EXISTS([SR Groups] = GetProfileAttrAsList("Employee Groups")), "Y", "N")
This one doesn't work; it always shows "N". However, according to this Bookshelf document:
a typical usage of the EXISTS operator in this scenario is EXISTS ([Targeted States] = GetProfileAttrAsList("State")). This does a many-to-many match of the MVG Business Component Field Targeted State against the MVG profile attribute State.
Which is exactly what I'm trying to do, without success. I can't see any difference between my expression and the example one. And there isn't any of the typical Bookshelf warnings, like "if you're going to use this function, you must activate the Link Specification property of the MVF", or anything like that.
The business component is based on a specialized class, CSSBCServiceRequest, but I don't think that should be a problem in this case - switching it to CSSBCBase doesn't fix the issue either. The only thing not working seems to be the EXISTS operator, which is pretty standard in Siebel.
Also, if I execute a query on the application with the expression EXISTS([SR Groups] = GetProfileAttrAsList("Employee Groups")), it doesn't filter out any service request as it should.
Any clues?
After a lot of testing, I've been able to figure out a workaround. I'd still like to know why my first attempt didn't work, but anyway...
Given that the problem with my first attempt seemed to be matching a many-to-many relationship between the MVF and the multivalued profile attribute, I've split it into two one-to-many matches:
In the link, I've established a search specification. This way, my multivalue field will contain only the groups associated with the current user:
InList([Group], GetProfileAttrAsList("Employee Groups"))
In the BC, it only remains to check if there is any value in the MVF or not:
IIf(EXISTS([Filtered SR Groups] IS NOT NULL), "Y", "N")

Doctrine 2 How to set an entity table name at run time (Zend 2)

I'm building a product with Zend 2 and Doctrine 2 and it requires that I have a separate table for each user to contain data unique to them. I've made an entity that defines what that table looks like but how do I change the name of the table to persist the data to, or in fact retrieve the data from, at run time?
Alternatively am I going to be better off giving each user their own database, and just changing which DB I am connecting to?
I'd question the design choice first. What happens if you create a new user at runtime? The table would have to be created first. Furthermore, what kind of data are you storing? To me this sounds like a pretty common multi-client requirement, like:
tbl_clients
- id
- name
tbl_clientdata
- client_id
- data_1_value
- data_2_value
- data_n_value
If you really want to silo users data, you'd have to go the separate databases route. But that only works if each "user" is really independent of each other. Think very hard about that.
If you're building some kind of software-as-a-service, and user A and user B are just two different customers of yours with no relationship to each other, then an N+1 database approach might be appropriate (one DB for each of your N users, plus one "meta" database which just holds user accounts and maybe billing-related stuff).
I've implemented something like this in ZF2/Doctrine2, and it's not terribly bad. You just create a factory for EntityManager that looks up the database information for whatever user is active, and configures the EM to connect to it. The only place it gets a bit tricky is when you find yourself writing some kind of shared job queue, where long-running workers need to switch database connections with some regularity -- but that's doable too.

How do you implement multi-tenancy on CouchBase? Can it be performant?

I'm considering an app which will store customer data. Given the way buckets work in CouchBase, all customer data will be in one bucket. It appears that I have two choices:
Implement multi-tenancy in views, by assigning a field to each record that indicates the customer it belongs to.
Implement it by making the customer ID part of every key.
It seems, though, that since I will be using views, I'll really want to do both. In option 2, I need to have the data in the record so that it can be indexed on (or maybe I can pull out part of the key in the map phase and index on customer), and in option 1, I'd want the customer to be part of the key as a check when retrieving data, to make sure I don't send the wrong customer's data down the line.
The problem is, this is a service where multiple customers will interact, and sometimes one customer will create some data and another will view it, at the first customer's request. But putting an ACL on each record that lists everyone who is authorized to view it would be problematic, to say the least.
I bet there is a common methodology or design pattern to answer this question, and would appreciate some pointers to best practices.
I'm also concerned about the performance if the indexes are indexing both on the particular piece of relevant data, and the customer id... a large number of different customers would presumably make the indexes much less efficient. (but maybe not.)
Here are my thoughts on your questions:
[Concerning items #1 and 2] - It seems, though, that since I will be using views, I'll really want to do both.
This doesn't seem to make sense to me. In Couchbase, the map phase can include content from both the key and the value. It makes little sense to store the data in both the key and the value, as you are guaranteed to have 1:1 duplication there. Store it wherever it makes the most sense to store it; in this case, probably the value.
The problem is, this is a service where multiple customers will interact, and sometimes one customer will create some data and another will view it, at the first customer's request. But putting an ACL on each record that lists everyone who is authorized to view it would be problematic, to say the least.
My site also has multi-tenant data stored in a single database. In my case, I use object unique identifiers as my keys. By default, customers can access all objects that belong to them (I have a user object, and the user is associated with a customer account). Users may also have additional permissions assigned to them, whereby a single object from another customer can be added to their user account, granting them access to view that object.
The alternative is "security through obscurity": use GUIDs as random identifiers, giving customers access to view any object that they have the GUID for.
I would not, however, try to store the permissions on the objects themselves. That would quickly become unwieldy. You need to think about your specific use case, and decide what simple approach would work for the majority of the cases, and just not support the other 1-2% of the cases.

Unique alpha-numeric string from a unique integer? (masking IDs in a game server?)

I have a multiplayer mobile game out in the wild; it's backed by a SQL database. Each game gets an ID, which is just an auto-increment field. I can look up a game with a URL like:
http://www.example.com/gameId=123
That url is not visible to players at the moment, but I was thinking of displaying it so users can invite friends and let non-players look on in the game as they play (through a browser - at the moment everyone plays through a native app).
But the fact that I'm putting the game ID out there in the open seems like a bad idea. If someone guessed an endpoint for, say, deleting a game, they could do bad stuff knowing the ID (of course my endpoints are protected by user auth, but still).
Do most services mask IDs of this sort, should I send out a url like:
http://www.example.com/gameId=maskedIdAbc
and then my game server has to translate that ID into the corresponding ID in my database?
Not sure if that's overkill. If not, what's a good way to generate a unique alpha-numeric string based off a unique integer?
Thanks
Why not change the primary key of the game from an incremental ID to a GUID? The game is out in the wild, but you should be able to get there in a number of steps: add the GUID as a field and allow games to be looked up either by ID or GUID, update your clients to use the GUID, phase out the ID, and finally change the primary key to be the GUID.
You could hash the int, or even use the hex, but it's breakable. It's better to implement a complete fix; if you don't want to use a GUID, you could implement your own equivalent - random characters that you store against each DB record - but why go to that trouble when GUIDs are usually natively supported by databases?
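As a rough Java illustration of the two options just mentioned (the class and method names are made up):
import java.util.UUID;

class PublicIdExamples {
    // random, non-guessable identifier to store alongside (or eventually instead of) the numeric ID
    static String guidId() {
        return UUID.randomUUID().toString();
    }

    // the "use the hex" option: compact, but trivially reversible back to the integer
    static String hexId(long gameId) {
        return Long.toHexString(gameId);
    }
}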
If the range of the integers is not big, you could define a table of unique, random alphanumeric strings. I think that's the best way.
I had a similar situation and did not want to use the gameID (using your example here) in the URL, as someone could just try any number. I can still use the IDs, but I need to add additional checks to authorize the users.
You could use a UUID to generate gameIDs, but I see a few problems with this:
- a non-numeric ID will have an impact on performance
- if it is the primary key and you want to use it as an FK on other tables, it takes more space
What I did:
In addition to gameID in my table, I added another column, WebGameID varchar(32). After the gameID was generated, I updated WebGameID = MD5(gameID). This gives a unique 32-character string for the specific gameID. With this I was able to keep using gameID for internal keys and FKs, and only use WebGameID in the URL, limiting user manipulation.
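A minimal Java sketch of that MD5 approach (the class and method names are made up; MD5 here is only used to get a deterministic 32-character hex string, not as a security measure):
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class WebGameIds {
    // e.g. webGameId(123) -> "202cb962ac59075b964b07152d234b70"
    static String webGameId(long gameId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(Long.toString(gameId).getBytes(StandardCharsets.UTF_8));
            // left-pad to 32 hex characters, since BigInteger drops leading zeros
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}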