Once again I come to you guys for your expertise and advice on an issue I'm having. I was wondering if any of you know how to detect whether a web page has been modified, using VB.NET. I need to be able to set up a task that periodically (like once a week) scans user-inputted web pages, and if a page's content has changed, fires off an email to an individual saying that it has changed (not the exact location of the change on the page itself). I'll be storing the HTTP status and, of course, the page data and the date it was last modified. This needs to be fault tolerant, since it will be a week before the check runs again. Any help would be great. Thank you.
Edit
A new twist on the question, sorry. I've had more time to think about what I wanted. So... detecting a change on a web page is kind of silly, since time-dependent elements of the page change every so often. Instead, I'd like to be able to detect changes to the documents in the page, for instance if Excel files, Word docs, or PDFs on the page have changed. So I'd like to run a hash on these documents on some sort of schedule to check whether new documents have been added or old documents have been modified. Any suggestions on how to detect the documents embedded in a page and run a hash on them? Thanks again!
As mentioned in a comment, this is the sort of job that checksums (also known as hash functions) are designed for.
You could code it like this:
- For each webpage of interest:
  - Pull the webpage
  - Calculate the checksum of its contents
  - Is the current checksum different from the last checksum?
    - If yes, send an email
  - Store the new checksum and other appropriate data

The .NET Framework has a number of checksums available. The two most popular are MD5 and SHA1.
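Here is a minimal VB.NET sketch of those steps, applied to the documents linked from a page (the variant described in the edit). Treat it as a starting point under several assumptions, not a finished implementation. The names `ComputeChecksum`, `ExtractDocumentLinks`, and `Classify` are made up for illustration, and the URL is a placeholder. The regular expression only catches plain `href` links ending in a document extension. The in-memory dictionary stands in for whatever database or file you would actually use to keep checksums between weekly runs. `WebClient` is the older .NET Framework download API; newer runtimes mark it obsolete in favour of `HttpClient`.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Security.Cryptography
Imports System.Text.RegularExpressions

Module ChangeDetector

    ' Returns the SHA-1 checksum of the given bytes as a lowercase hex string.
    Function ComputeChecksum(data As Byte()) As String
        Using sha As SHA1 = SHA1.Create()
            Dim hash As Byte() = sha.ComputeHash(data)
            Return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant()
        End Using
    End Function

    ' Finds href links to PDF, Word, and Excel files in a page's HTML and
    ' resolves them against the page address. Hypothetical helper: a real
    ' HTML parser would be more robust than this simple pattern.
    Function ExtractDocumentLinks(html As String, baseUri As Uri) As List(Of Uri)
        Dim links As New List(Of Uri)
        Dim pattern As String = "href\s*=\s*[""']([^""']+\.(?:pdf|docx?|xlsx?))[""']"
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            Dim resolved As Uri = Nothing
            If Uri.TryCreate(baseUri, m.Groups(1).Value, resolved) Then
                links.Add(resolved)
            End If
        Next
        Return links
    End Function

    ' Compares the new checksum with the stored one, then stores the new one.
    ' Returns "new", "changed", or "unchanged".
    Function Classify(key As String, checksum As String,
                      stored As Dictionary(Of String, String)) As String
        Dim previous As String = Nothing
        Dim status As String
        If Not stored.TryGetValue(key, previous) Then
            status = "new"
        ElseIf previous <> checksum Then
            status = "changed"
        Else
            status = "unchanged"
        End If
        stored(key) = checksum
        Return status
    End Function

    Sub Main()
        ' Assumption: in practice this would be loaded from and saved to
        ' persistent storage so it survives between scheduled runs.
        Dim stored As New Dictionary(Of String, String)
        Dim pageUri As New Uri("http://example.com/reports/index.html") ' placeholder

        Using client As New WebClient()
            Try
                Dim html As String = client.DownloadString(pageUri)
                For Each docUri As Uri In ExtractDocumentLinks(html, pageUri)
                    Dim data As Byte() = client.DownloadData(docUri)
                    Dim status As String =
                        Classify(docUri.AbsoluteUri, ComputeChecksum(data), stored)
                    If status <> "unchanged" Then
                        ' This is where the notification email would be sent.
                        Console.WriteLine("{0}: {1}", status, docUri)
                    End If
                Next
            Catch ex As WebException
                ' A failed download is logged rather than treated as a change,
                ' so one bad request does not trigger a false alert.
                Console.WriteLine("Download failed: " & ex.Message)
            End Try
        End Using
    End Sub

End Module
```

Hashing the raw bytes of each document avoids the problem of time-dependent page elements, since only the linked files are compared. To check whole pages instead, pass the bytes of the page itself to `ComputeChecksum` and use the page URL as the key.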