However I have models with lots of has_many's such as:
Account has_many Recipie_Books which has_many Recipes which has_many
Ingredients
The usual method of daisy-chaining them in the controller like:
current_user.account.books.recipies.ingredients.etc...find(:id)
will have a big overhead if we only want to check if the user can
access a record such as ingredients.
Alternative methods could be adding user or account to the final model
to allow quick checking, or adding a method to the model to do the
work like is_accessible?
So I suppose the question is, in the real world, what techniques have
you found work and scale well?
However I have models with lots of has_many's such as:
Account has_many Recipie_Books which has_many Recipes which has_many
Ingredients
The usual method of daisy-chaining them in the controller like:
current_user.account.books.recipies.ingredients.etc...find(:id)
will have a big overhead if we only want to check if the user can
access a record such as ingredients.
Alternative methods could be adding user or account to the final model
to allow quick checking, or adding a method to the model to do the
work like is_accessible?
So I suppose the question is, in the real world, what techniques have
you found work and scale well?
> However I have models with lots of has_many's such as: > Account has_many Recipie_Books which has_many Recipes which has_many > Ingredients
> The usual method of daisy-chaining them in the controller like: > current_user.account.books.recipies.ingredients.etc...find(:id) > will have a big overhead if we only want to check if the user can > access a record such as ingredients.
> Alternative methods could be adding user or account to the final model > to allow quick checking, or adding a method to the model to do the > work like is_accessible?
> So I suppose the question is, in the real world, what techniques have > you found work and scale well?
> -- > Andrew Threlfall
One way would be to use has_many :through, e,g.
Account has_many :recipe_books has_many :recipes, :through => :recipe_books end
ActiveRecord will use an inner join to load recipes in one query.
Unfortunately, there isn't support for nested has_many :through associations as of yet, so you wouldn't be able to go any deeper.
Alternatively, you could use the Searchlogic gem, which does some very nice things with deep associations and dynamic named scopes so you could do something like:
> One way would be to use has_many :through, e,g.
...
> Unfortunately, there isn't support for nested has_many :through
> associations as of yet, so you wouldn't be able to go any deeper.
Our models do go fairly deep, so I'd be worried about the performance
of lots of inner joins - just to check the user has the 'right' to see
the final instance.
Thanks for the Searchlogic pointer, could be useful in the future.
> The usual method of daisy-chaining them in the controller like: > current_user.account.books.recipies.ingredients.etc...find(:id) > will have a big overhead if we only want to check if the user can > access a record such as ingredients.
What's the business rule here that decides if a user can see an ingredient?
On 30 Oct, 18:05, Ashley Moran <ashley.mo...@patchspace.co.uk> wrote:
> On 30 Oct 2009, at 14:11, Andrew Threlfall wrote:
> > The usual method of daisy-chaining them in the controller like:
> > current_user.account.books.recipies.ingredients.etc...find(:id)
> > will have a big overhead if we only want to check if the user can
> > access a record such as ingredients.
> What's the business rule here that decides if a user can see an
> ingredient?
The system needs to be able to do processing to see the ingredients
based on a subset of the recipes eg vegetarian ones. So I essentially
want to see the ingredients in all the vegetarian recipes in all the
books in my account. In this instance the ingredients are all unique,
and not common for other recipes. Also if someone wants to view or
edit ingredient with id 20,000 how do I very quickly know if that one
of theirs?
The model names have been changed, but the analogy is more or the less
the same.
Technically it's possible with scoping, but more a case of wondering
if there is any good practice that will mean this can be done
efficiently. I don't want to be joining potentially hundreds of
thousands of rows, or adding permission methods to models, or adding
user or account ids to models further down the tree too early, but I
can see it becoming a problem very soon.
On Oct 31, 2009, at 11:46 am, Andrew Threlfall wrote:
> The system needs to be able to do processing to see the ingredients > based on a subset of the recipes eg vegetarian ones. So I essentially > want to see the ingredients in all the vegetarian recipes in all the > books in my account. In this instance the ingredients are all unique, > and not common for other recipes. Also if someone wants to view or > edit ingredient with id 20,000 how do I very quickly know if that one > of theirs?
You don't say if the books are specific to users or not, but presumably the recipes are specific to books. So I'm guessing your relations look like this (I hope you're have a fixed-width font):
Or the only difference is a many-many connection between Users and Books. Because you mention putting user ids further down the tree, I'm assuming that's not the case.
> The model names have been changed, but the analogy is more or the less > the same.
I did wonder if you really were working on Delia 3.0 =)
> Technically it's possible with scoping, but more a case of wondering > if there is any good practice that will mean this can be done > efficiently. I don't want to be joining potentially hundreds of > thousands of rows, or adding permission methods to models, or adding > user or account ids to models further down the tree too early, but I > can see it becoming a problem very soon.
I can see a few options:
* Suck it and see
Take a chance that performance will not be an issue. Without benchmarking, I'd (personally) be inclined to go down this route. But the risk rises as the consequences of long page loads become more severe, or if you thought you'd lose business faster than you could fix the problem. If users would put up with poor performance for a few weeks, and you know the lead time on implementing a story is 2-3 days, it may be a risk worth taking.
* Make the work asynchronous
This may or may not be an option based on the way you're accessing the data. But you could load the screened data asynchronously, and use JavaScript to make the interface more responsive. You'd still have the performance issue, but it may be perceived as less severe, which would reduce the pressure to fix it later.
* Implement explicit permissions at each level
This would involve adding user_book_permission, user_recipe_permission etc tables. You'd have to enforce consistency with business rules somewhere (presumably, you'd have to have recipe permissions to get ingredient permissions). There would be a lot more work there on setting permissions, but a lot less work using them.
I don't have a name for this technique, but I think of it as "naming the fact". You could consider the fact that a user can see an ingredient as implicit, because he owns it, or you could "name the fact" of user-has-permission-to-see-ingredient and record it explicitly. Often this can be done without introducing redundancy if you name the fact right.
Ask me to elaborate if you want, I'm not sure that explanation is clear.
* Benchmark and optimise
Do you have an idea of the proportion of data real users will have at each point? If the Ingredients is where most of the data goes it might not even be an issue.
The join to get at the Ingredients for a category of recipes would be something like (I'm using natural join[1] to save writing all the fluff, although this doesn't actually work with ActiveRecord naming conventions):
ingredients NATURAL JOIN recipes NATURAL JOIN recipe_categories NATURAL JOIN books NATURAL JOIN users WHERE recipe_category_id IN (X) AND user_id = Y
Be aware I obviously didn't test that, so it might actually be wrong.
Now, then regardless of how many ingredients you have, it depends if your query optimiser will realise that `WHERE recipe_category_id IN (X) ` will restrict the join soonest. ie, will it find all the requested recipes and either return them, or throw them away because the user_id was wrong? Postgres does this quite happily, I can't say for MySQL (I've met people who've had to force the order of the query clauses to make the optimiser work correctly).
I don't know how to go about generating that query in ActiveRecord, but an interesting discussion on the DataMapper list came up recently [2].
If you don't have the option to restrict the query further down, ie a user could request an Ingredient with no context, you'd have to rely on join performance. Is this a normal use case? If not, you could suck it and see.
Hmm, that turned into a bit of an essay once I got started. Not sure how helpful it is, but it was useful to me as I now realise why I favour "suck it and see". Anyway, that's the thought process I use, even if the concrete examples are not relevant.
> The usual method of daisy-chaining them in the controller like: > current_user.account.books.recipies.ingredients.etc...find(:id) > will have a big overhead if we only want to check if the user can > access a record such as ingredients.
I talked to Andy about this on Friday and have seen the actual specific problem code he's dealing with. In fact, it's v2.0 of code I wrote a couple of years ago. :-)
My suggestion (and I've just hinted at it on GeekUp in a query on optimisation), is to chain it the other way around. This scoping is being used to basically say "Can this user see this ingredient?". Another way is to say "Is this ingredient in a relationship viewable by this user?" so you could:
Ingredient.find(:id).recipe.book.account.users
That gives me all the users up the belongs_to chain who can see the ingredient, so I could have a method on Ingredient to return "allowabled" users and a before_filter or a helper that determines if the current_user is in that group.
Chaining back up the belongs_to route is much more efficient in terms of CPU and RAM than lots of inner joins. I accept it's not perhaps the purist way to do things but it works and it's clean enough, IMHO.
On Mon, Nov 2, 2009 at 1:54 PM, Paul Robinson <p...@vagueware.com> wrote:
> On 30 Oct 2009, at 14:11, Andrew Threlfall wrote:
> > The usual method of daisy-chaining them in the controller like:
> > current_user.account.books.recipies.ingredients.etc...find(:id)
> > will have a big overhead if we only want to check if the user can
> > access a record such as ingredients.
> I talked to Andy about this on Friday and have seen the actual
> specific problem code he's dealing with. In fact, it's v2.0 of code I
> wrote a couple of years ago. :-)
> My suggestion (and I've just hinted at it on GeekUp in a query on
> optimisation), is to chain it the other way around. This scoping is
> being used to basically say "Can this user see this ingredient?".
> Another way is to say "Is this ingredient in a relationship viewable
> by this user?" so you could:
> Ingredient.find(:id).recipe.book.account.users
> That gives me all the users up the belongs_to chain who can see the
> ingredient, so I could have a method on Ingredient to return
> "allowabled" users and a before_filter or a helper that determines if
> the current_user is in that group.
> Chaining back up the belongs_to route is much more efficient in terms
> of CPU and RAM than lots of inner joins. I accept it's not perhaps the
> purist way to do things but it works and it's clean enough, IMHO.
I think you've still got a bad Demeter violation, but I can't think off hand
how you'd fix it without resorting to hiding stuff in views or creating
another joining table that just does the direct mapping between the two
entities - maybe this would be easier, creating some triggers that police it
(in Ruby or MySQL, whatever)? Plus you've always got the old find_by_sql
method that you can hide. This might be better and more obvious.
I think the usual fixit approach of creating accessor methods (which is what
through is for) would just be a complete PITA and just push the Demeter
problem downward with zero gain.