I have a database with files which can be searched, browsed and have multiple copies on multiple servers.

I cache searches, browse pages and server locations (urls). Say I delete a file, what's a good way to invalidate all searches, browse data and urls for this file? Or if a file server goes down, and I need to invalidate all urls pointing to this server?

Essentially I'm looking for something similar to memcache-tags, but with standard memcache and php components. (Without having to change anything on the web server itself). I need some sort of many to many relation (one server has many files, and one file has multiple servers) in between keys, but can't seem to figure out a good way to accomplish this. In some situations stale cache is acceptable (minor updates etc.), but in some cases it's not (typically delete, and server down) where I need to invalidate all cache items containing references to it.

Some approaches I have looked at:

Namespaces:

$ns_key = $memcache->get("foo_namespace_key");
// if not set, initialize it
if($ns_key===false) $memcache->set("foo_namespace_key", rand(1, 10000));
// cleverly use the ns_key
$my_key = "foo_".$ns_key."_12345";
$my_val = $memcache->get($my_key);

//To clear the namespace do:
$memcache->increment("foo_namespace_key");
  • Restricts the cache key to a single namespace

Item caching approach:

$files = array('file1','file2');
// Cache all files as single entries
foreach ($files as $file) {
  $memcache->set($file.'_key');
}
$search = array('file1_key','file2_key');
// Retrieve all items found by search (typically cached as file ids)
foreach ($search as $item) {
  $memcache->get($item);
}
  • Gives a problem if a file server is down, and all keys containing urls to this server should be invalidated (E.G large number of small cache items would be needed which in turn requires a large amount of requests against cache)- breaks any chance of caching full objects and resultsets

Tag implemenation:

class KeyEnabled_Memcached extends Zend_Cache_Backend_Memcached
{

    private function getTagListId()
    {
        return "MyTagArrayCacheKey";
    }

    private function getTags()
    {
        if(!$tags = $this->_memcache->get($this->getTagListId()))
        {
            $tags = array();
        }
        return $tags;
    }

    private function saveTags($id, $tags)
    {
        // First get the tags
        $siteTags = $this->getTags();

        foreach($tags as $tag)
        {
            $siteTags[$tag][] = $id;
        }
        $this->_memcache->set($this->getTagListId(), $siteTags);        
    }

    private function getItemsByTag($tag)
    {
        $siteTags = $this->_memcache->get($this->getTagListId());
        return isset($siteTags[$tag]) ? $siteTags[$tag] : false;
    }

    /**
     * Save some string datas into a cache record
     *
     * Note : $data is always "string" (serialization is done by the
     * core not by the backend)
     *
     * @param  string $data             Datas to cache
     * @param  string $id               Cache id
     * @param  array  $tags             Array of strings, the cache record will be tagged by each string entry
     * @param  int    $specificLifetime If != false, set a specific lifetime for this cache record (null => infinite lifetime)
     * @return boolean True if no problem
     */
    public function save($data, $id, $tags = array(), $specificLifetime = false)
    {
        $lifetime = $this->getLifetime($specificLifetime);
        if ($this->_options['compression']) {
            $flag = MEMCACHE_COMPRESSED;
        } else {
            $flag = 0;
        }
        $result = $this->_memcache->set($id, array($data, time()), $flag, $lifetime);
        if (count($tags) > 0) {
            $this->saveTags($id, $tags);
        }
        return $result;
    }

    /**
     * Clean some cache records
     *
     * Available modes are :
     * 'all' (default)  => remove all cache entries ($tags is not used)
     * 'old'            => remove too old cache entries ($tags is not used)
     * 'matchingTag'    => remove cache entries matching all given tags
     *                     ($tags can be an array of strings or a single string)
     * 'notMatchingTag' => remove cache entries not matching one of the given tags
     *                     ($tags can be an array of strings or a single string)
     *
     * @param  string $mode Clean mode
     * @param  array  $tags Array of tags
     * @return boolean True if no problem
     */
    public function clean($mode = Zend_Cache::CLEANING_MODE_ALL, $tags = array())
    {
        if ($mode==Zend_Cache::CLEANING_MODE_ALL) {
            return $this->_memcache->flush();
        }
        if ($mode==Zend_Cache::CLEANING_MODE_OLD) {
            $this->_log("Zend_Cache_Backend_Memcached::clean() : CLEANING_MODE_OLD is unsupported by the Memcached backend");
        }
        if ($mode==Zend_Cache::CLEANING_MODE_MATCHING_TAG) {
            $siteTags = $newTags = $this->getTags();
            if(count($siteTags))
            {
                foreach($tags as $tag)
                {
                    if(isset($siteTags[$tag]))
                    {
                        foreach($siteTags[$tag] as $item)
                        {
                            // We call delete directly here because the ID in the cache is already specific for this site
                            $this->_memcache->delete($item);
                        }
                        unset($newTags[$tag]);
                    }
                }
                $this->_memcache->set($this->getTagListId(),$newTags);
            }
        }
        if ($mode==Zend_Cache::CLEANING_MODE_NOT_MATCHING_TAG) {
            $siteTags = $newTags = $this->getTags();
            if(count($siteTags))
            {
                foreach($siteTags as $siteTag => $items)
                {
                    if(array_search($siteTag,$tags) === false)
                    {
                        foreach($items as $item)
                        {
                            $this->_memcache->delete($item);
                        }
                        unset($newTags[$siteTag]);
                    }
                }
                $this->_memcache->set($this->getTagListId(),$newTags);
            }
        }
    }
}
  • No control over what keys are invalidated when, due to internal memcache key dropping, can drop a tag key which in turn invalidate a large number of actual valid keys (which would still exist)
  • Issues with write concurrency

Two-step cache system:

// Having one slow, and one fast cache mechanism where the slow cache is reliable storage  containing a copy of tag versions 
$cache_using_file['tag1'] = 'version1';
$cache_using_memcache['key'] = array('data' = 'abc', 'tags' => array('tag1' => 'version1');
  • Potential bottleneck using disk/mysql etc. for the slow cache
  • Issues with write concurrency

Comments

I'm a fanboy for namespacing. Also remember, keys can be 250 characters long, so if you have a programmatic way of generating namespaces you can tack on multiple namespace's in front of a key.

Written by David

See stackoverflow.com/questions/322709/php-memcache-design-patterns

Written by Alex Howansky

Accepted Answer

Having come accross the comment here, which explains the logic of evicting existing keys, I believe tags can be implemented reliable by the version flags approach mentioned in: PHP memcache design patterns

I actually implemented this logic once already, but discarded it as unreliable due to memcache eviction of elements before they expire. You can find my initial implementation here. I do however, believe this is a reliable tags pattern because:

  • On cache object retrieval, the object is moved on top of the LRU stack
  • All tags/flags are retrieved right after the cache object
  • If a flag is evicted, any item containing this flag would have been evicted just before
  • Incrementing a flag returns the new value in the same operation, avoiding write concurrency. If a second thread is incrementing the same flag before the cache object is written, it is by definition invalid already

Please correct me if I'm wrong! :-)

Written by Jon Skarpeteig
This page was build to provide you fast access to the question and the direct accepted answer.
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki