I have read that boolean columns don't serve well as search indexes. My question is: since a clustered index affects the physical arrangement of records, can't it be used to put one type of record together (in the same pages), so that some pages have less chance of being loaded into memory? Let me try to explain better. Take a table

    [bookpages] id (int), deleted (boolean), text (varchar)

If the clustered index is on the id column, the sample data would be

    1, true, 'the quick..'
    2, false, 'hello w..'
    3, true, 'stack m..'
    4, false, 'just thin...'

This means deleted and active records are interleaved. If I search for record 2

    select [text] from [bookpages] where [deleted] = false and [id] = 2

the "leaf" data page may end up holding rows (1, 2), which means loading into memory records with the deleted flag set, which I am never interested in. If the clustered index is on columns (deleted, id), the data would be

    2, false, 'hello w..'
    4, false, 'just thin...'
    1, true, 'the quick..'
    3, true, 'stack m..'

Now, when I target active records, the pages SQL Server loads are full of active records.
So on a database with a long history and a lot of deleted records, I can have better locality for the records I want, and therefore less I/O.
And across thousands of pages, I can make sure that a large chunk of them is never loaded into memory, so that data remains on disk.
Is this reasoning correct? Might it impact (improve) overall performance on large databases?
Yes, your reasoning is correct. You can in effect partition the data set into two regions, one hot and one cold. Using a bit is a special case of this technique. You could also use a date column and cluster on that (of course, whether this is feasible or not depends on the schema and the data).
Partitioning has a similar effect. Choosing the clustering key is lighter weight, though.
Oftentimes clustering on an auto-incremented number already has this locality, because the identity value correlates with age, and age correlates with frequency of usage.
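To make the hot/cold idea concrete, here is a minimal T-SQL sketch of the question's table clustered on (deleted, id). The schema name, index name, and VARCHAR(MAX) length are assumptions of mine for illustration; the question only gives the column names and types loosely.

```sql
-- Hypothetical table modeled on the question's [bookpages] example.
CREATE TABLE dbo.bookpages (
    id      INT          NOT NULL,
    deleted BIT          NOT NULL DEFAULT 0,
    [text]  VARCHAR(MAX) NULL
);

-- Cluster on (deleted, id): active rows (deleted = 0) are stored together,
-- followed by the soft-deleted rows (deleted = 1).
CREATE UNIQUE CLUSTERED INDEX cix_bookpages_deleted_id
    ON dbo.bookpages (deleted, id);

INSERT INTO dbo.bookpages (id, deleted, [text]) VALUES
    (1, 1, 'the quick..'),
    (2, 0, 'hello w..'),
    (3, 1, 'stack m..'),
    (4, 0, 'just thin...');

-- Both clustering key columns are specified, so this can be answered
-- with a single seek into the "active" region of the index.
SELECT [text]
FROM dbo.bookpages
WHERE deleted = 0 AND id = 2;
```

With only four rows everything fits on one page, so the locality benefit only shows up at scale; the sketch just shows the shape of the index and the query.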
The same optimization does not apply directly to nonclustered indexes. You can use a boolean prefix for them, too, but you need to provide it in a SARGable form:

    where somencindexcol = '1234' and deleted in (0, 1)

SQL Server is not smart enough to figure this out by itself. It cannot "skip" the first index level like Oracle can. You have to provide the seek keys manually. (Connect item: https://connect.microsoft.com/sqlserver/feedback/details/695044)
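As a sketch of what that looks like in practice, assume a hypothetical demo table with a nonclustered index whose leading column is the deleted flag. The table and index names are made up; only somencindexcol and deleted come from the answer above.

```sql
-- Hypothetical demo table; names are assumptions for illustration only.
CREATE TABLE dbo.bookpages_nci_demo (
    id             INT         NOT NULL,
    deleted        BIT         NOT NULL,
    somencindexcol VARCHAR(20) NOT NULL
);

-- Nonclustered index prefixed with the boolean column.
CREATE NONCLUSTERED INDEX ix_nci_demo_deleted_col
    ON dbo.bookpages_nci_demo (deleted, somencindexcol)
    INCLUDE (id);

-- No predicate on the leading column: the optimizer cannot seek on
-- somencindexcol here and will typically fall back to a scan.
SELECT id
FROM dbo.bookpages_nci_demo
WHERE somencindexcol = '1234';

-- Enumerating both possible values of the prefix makes the predicate
-- SARGable, so the index can be used with one seek per value.
SELECT id
FROM dbo.bookpages_nci_demo
WHERE somencindexcol = '1234'
  AND deleted IN (0, 1);
```

Comparing the actual execution plans of the two SELECT statements on a table with realistic row counts is the way to confirm the difference on your own data.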
A different concern is write performance. Marking a row as deleted (set deleted = 1) changes the clustering key, so it requires a physical delete+insert pair in the CI, plus one for each NCI. Primary key changes are not supported by ORMs, so you should not make the clustering key the primary key.
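One way to keep the primary key stable while still clustering on the flag, sketched under my own naming assumptions (the table, constraint, and index names are hypothetical):

```sql
-- Hypothetical table: the primary key stays on id alone (nonclustered),
-- so an ORM never sees a primary key change.
CREATE TABLE dbo.bookpages_pk_demo (
    id      INT          NOT NULL
        CONSTRAINT pk_bookpages_pk_demo PRIMARY KEY NONCLUSTERED,
    deleted BIT          NOT NULL
        CONSTRAINT df_bookpages_pk_demo_deleted DEFAULT 0,
    [text]  VARCHAR(MAX) NULL
);

-- The clustering key is separate from the primary key.
CREATE UNIQUE CLUSTERED INDEX cix_bookpages_pk_demo
    ON dbo.bookpages_pk_demo (deleted, id);

INSERT INTO dbo.bookpages_pk_demo (id, [text]) VALUES (2, 'hello w..');

-- Soft delete: the primary key value is unchanged, but the clustering
-- key changes, so the row physically moves to the "deleted" region and
-- the nonclustered index entry pointing at it is rewritten as well.
UPDATE dbo.bookpages_pk_demo
SET deleted = 1
WHERE id = 2;
```

The write cost described above still applies; this layout only avoids exposing the changing key as the primary key.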
As a side note, creating an index on a bit column has other use cases as well. If 99% of the values are either 0 or 1, you can use the index to perform a seek and key lookup for the rare remaining value. You can also use such an index for counting (or grouping on the bit column).
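A small sketch of both uses, again with hypothetical table and index names, and assuming deleted rows are the rare case:

```sql
-- Hypothetical table clustered on id, with a narrow index on the bit column.
CREATE TABLE dbo.bookpages_bit_demo (
    id      INT          NOT NULL
        CONSTRAINT pk_bookpages_bit_demo PRIMARY KEY CLUSTERED,
    deleted BIT          NOT NULL,
    [text]  VARCHAR(MAX) NULL
);

CREATE NONCLUSTERED INDEX ix_bit_demo_deleted
    ON dbo.bookpages_bit_demo (deleted);

-- Rare value: a seek on the bit index plus key lookups into the
-- clustered index can be cheaper than scanning the whole table.
SELECT id, [text]
FROM dbo.bookpages_bit_demo
WHERE deleted = 1;

-- Counting or grouping can be answered from the narrow index alone,
-- without reading the wide [text] column.
SELECT deleted, COUNT(*) AS row_count
FROM dbo.bookpages_bit_demo
GROUP BY deleted;
```

Whether the optimizer actually picks the seek plus lookup depends on its row estimates, so treat this as something to verify in the execution plan.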