Extending FalkorDB with User-Defined Functions and FLEX

Highlights

FalkorDB now supports user-defined functions (UDFs) written in JavaScript, enabling developers to extend graph database capabilities. This extensibility model addresses a persistent limitation in database systems where custom logic requirements fall outside built-in function scope.

UDFs execute within the query engine alongside native Cypher operations, processing nodes, edges, and paths with full access to graph structure. The FLEX (FalkorDB Library of Extensions) package provides production-ready functions for string manipulation, similarity calculations, and data transformations commonly needed in graph analytics pipelines.

FLEX Library Coverage

  • Similarity (flex.sim.*): set similarity metrics for fuzzy matching and comparison
  • Text (flex.text.*): string manipulation, formatting, case conversion, and string similarity
  • Collection (flex.coll.*): set operations, transformations, and array utilities
  • Map (flex.map.*): property manipulation and object transformation
  • JSON (flex.json.*): JSON serialization and parsing utilities
  • Date (flex.date.*): date and time manipulation, formatting, and parsing
  • Bitwise (flex.bitwise.*): low-level bitwise operations on integers

Why JavaScript for Graph Database Extensions

Database engines face a trade-off when supporting user-defined logic: flexibility versus integration complexity. JavaScript emerged as the pragmatic choice for FalkorDB UDFs due to its lightweight runtime and straightforward embedding model. Python, while ubiquitous in data engineering, requires more complex packaging and dependency management when embedded in database systems.

UDFs improve application performance by eliminating round-trips between database and application layers. Calculations that previously required fetching data, processing in application code, and writing results back now execute in a single query operation.

FLEX Function Categories

  • Similarity (flex.sim.*), fuzzy matching and comparison: sim.jaccard()
  • Text (flex.text.*), string processing and normalization: text.levenshtein(), text.jaroWinkler(), text.camelCase(), text.replace()
  • Collection (flex.coll.*), set operations and aggregation: coll.union(), coll.intersection(), coll.frequencies()
  • Map (flex.map.*), property manipulation: map.merge(), map.submap(), map.removeKeys()
  • JSON (flex.json.*), API and data exchange: json.toJson(), json.fromJsonMap(), json.fromJsonList()
  • Date (flex.date.*), temporal operations: date.truncate(), date.format(), date.parse()
  • Bitwise (flex.bitwise.*), permission and flag management: bitwise.and(), bitwise.or(), bitwise.xor()

“UDFs enable SQL queries to execute faster by being compiled and stored in the database. Besides, UDFs can prevent round-trips between the database and the application, thus optimizing the performance of programming.” – TowardsDataScience

Implementing Custom Functions in FalkorDB

FalkorDB exposes a falkor object within the JavaScript runtime that registers functions for Cypher query access. Each UDF library contains one or more JavaScript functions, with explicit registration determining which functions become queryable.

The basic registration pattern:

				
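// Upper-case every character at an odd index (0-based); leave the rest unchanged
function UpperCaseOdd(s) {
  return s.split('')
    .map((char, i) => i % 2 !== 0 ? char.toUpperCase() : char)
    .join('');
}

// Expose the function to Cypher queries under the name 'UpperCaseOdd'
falkor.register('UpperCaseOdd', UpperCaseOdd);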

Loading a UDF library requires three things: a library name, the JavaScript source code, and a connection to the FalkorDB instance. The Python client provides udf_load() for library management; the underlying GRAPH.UDF LOAD command can also be issued over a direct Redis connection.

Functions operate on five data types: scalars, nodes, edges, paths, and collections. Node objects expose id, labels, and attributes properties, plus a getNeighbors() method for traversal operations. Edge objects provide id, type, startNode, endNode, and attributes. Path objects contain nodes, length, and relationships arrays.
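
To make these object shapes concrete, here is a minimal sketch of a UDF that only reads the node fields listed above. The function name describeNode and the mock node are hypothetical, and returning a map from a UDF is an assumption. The typeof falkor guard is there only so the snippet also runs outside FalkorDB (for example under Node.js) for quick local checks.

```javascript
// Hypothetical UDF: summarise a node using only the fields described above
// (id, labels, attributes, getNeighbors()).
function describeNode(node) {
  return {
    id: node.id,
    labels: node.labels,
    propertyKeys: Object.keys(node.attributes).sort(),
    degree: node.getNeighbors().length,
  };
}

// Register only when running inside FalkorDB, where `falkor` is defined.
if (typeof falkor !== 'undefined') {
  falkor.register('describeNode', describeNode);
}

// Local stand-in for a node object. Its shape is an assumption based on the
// description above; it is not an object produced by FalkorDB.
const mockNode = {
  id: 7,
  labels: ['Person'],
  attributes: { name: 'Ada', age: 36 },
  getNeighbors: () => [{ id: 8 }, { id: 9 }],
};

const summary = describeNode(mockNode);
console.log(summary);
```

Testing against a mock like this catches shape mistakes (for example reading node.name instead of node.attributes.name) before the library is loaded into the database.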

Graph-Native Operations in UDFs

Standard built-in functions handle common calculations, but domain-specific logic often requires custom implementations. FalkorDB’s UDF model excels when query logic demands programmatic control over graph traversal patterns.

Consider Jaccard similarity for nodes, which measures the overlap between their neighbor sets. The formula $J(A,B) = |A \cap B| / |A \cup B|$ requires collecting neighbors, computing set operations, and calculating a ratio:

				
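// Set helpers over arrays of neighbor IDs; duplicates are removed
const union = (a, b) => [...new Set([...a, ...b])];
const intersection = (a, b) => [...new Set(a)].filter(x => b.includes(x));

// Jaccard similarity of two nodes, based on their neighbor ID sets
function jaccard(a, b) {
  const aIds = a.getNeighbors().map(x => x.id);
  const bIds = b.getNeighbors().map(x => x.id);
  const unionSize = union(aIds, bIds).length;
  const intersectionSize = intersection(aIds, bIds).length;
  return unionSize === 0 ? 0 : intersectionSize / unionSize;
}

falkor.register('jaccard', jaccard);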

This UDF invokes Cypher-style traversals via getNeighbors() while applying JavaScript logic for set operations. Query authors reference the function as similarity.jaccard(node1, node2) after loading the library.
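
As a small worked example with made-up neighbor sets, suppose the first node has neighbors with IDs {1, 2, 3} and the second has neighbors with IDs {2, 3, 4}:

```latex
\[
J(A,B) = \frac{|A \cap B|}{|A \cup B|}
       = \frac{|\{2,3\}|}{|\{1,2,3,4\}|}
       = \frac{2}{4} = 0.5
\]
```

Two nodes with identical, non-empty neighbor sets score 1, and nodes with no shared neighbors score 0. When both neighbor sets are empty the union is empty too, which is why the function returns 0 rather than dividing by zero.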

Custom traversals demonstrate UDF value when standard Cypher patterns prove insufficient. A depth-first search that expands only to neighbors meeting numeric threshold conditions requires explicit control flow:

				
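// Depth-first search that only follows a neighbor whose amount exceeds
// the total accumulated along the current path.
function DFS_IncreasingAmounts(n, visited, total, reachables) {
  visited.push(n.id);
  for (const neighbor of n.getNeighbors()) {
    // Node properties are read through .attributes
    const amount = neighbor.attributes.amount;
    if (visited.includes(neighbor.id) || amount <= total) {
      continue; // already visited, or amount too small to extend the path
    }
    reachables.push(neighbor);
    DFS_IncreasingAmounts(neighbor, visited, total + amount, reachables);
  }
}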

This pattern handles scenarios where relationship traversal depends on accumulated path properties rather than static graph structure.
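
The recursive helper takes four arguments, so a query would typically call a single-argument entry point that sets up the bookkeeping. The sketch below shows one way to do that. The name collectIncreasingAmounts, the choice to seed the running total with the start node's own amount, and returning neighbor IDs rather than node objects are all assumptions made for illustration, as is reading properties through .attributes. The mock nodes exist only so the traversal can be checked outside FalkorDB.

```javascript
// Recursive helper mirroring the DFS above; a Set replaces the visited array
// so membership checks stay O(1).
function dfsIncreasingAmounts(n, visited, total, reachables) {
  visited.add(n.id);
  for (const neighbor of n.getNeighbors()) {
    const amount = neighbor.attributes.amount;
    if (visited.has(neighbor.id) || amount <= total) continue;
    reachables.push(neighbor.id);
    dfsIncreasingAmounts(neighbor, visited, total + amount, reachables);
  }
}

// Hypothetical entry point: seeds the running total with the start node's amount.
function collectIncreasingAmounts(start) {
  const reachables = [];
  dfsIncreasingAmounts(start, new Set(), start.attributes.amount, reachables);
  return reachables;
}

// Register only when running inside FalkorDB, where `falkor` is defined.
if (typeof falkor !== 'undefined') {
  falkor.register('collectIncreasingAmounts', collectIncreasingAmounts);
}

// Mock nodes for local testing (assumed shape, not FalkorDB objects).
function mockNode(id, amount) {
  return { id, attributes: { amount }, neighbors: [], getNeighbors() { return this.neighbors; } };
}
const a = mockNode(1, 10);
const b = mockNode(2, 15);
const c = mockNode(3, 5);
const d = mockNode(4, 30);
a.neighbors = [b, c]; // c is skipped: 5 <= 10
b.neighbors = [d];    // d is followed: 30 > 10 + 15

const reached = collectIncreasingAmounts(a);
console.log(reached);
```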

FLEX Library Functions

FLEX provides categorized function sets for common data engineering operations. The library targets fuzzy matching, text normalization, temporal calculations, and bitwise operations that graph queries frequently require.

Text and Similarity Operations

String similarity metrics enable approximate matching when exact comparisons fail. Levenshtein distance computes the character-level edit distance between two strings, while Jaro-Winkler similarity is better suited to short strings such as names. The text.levenshtein() function returns an integer: the minimum number of single-character edits needed to transform one string into the other.
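
For readers who want to see what an edit-distance computation involves, here is a minimal sketch of the classic dynamic-programming algorithm in plain JavaScript. It illustrates the technique only and is not FLEX's implementation of text.levenshtein(); the function name is a local stand-in.

```javascript
// Levenshtein distance with two rolling rows of the DP table.
// Compares UTF-16 code units, so characters outside the BMP count as two units.
function levenshtein(a, b) {
  // prev[j] = distance between the empty prefix of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein('kitten', 'sitting')); // 3
```

The two-row layout keeps memory proportional to the length of the second string, which matters when a query compares many property values in one pass.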

Text manipulation functions normalize data formats across heterogeneous sources:

  • text.camelCase() and text.snakeCase() standardize field naming conventions
  • text.replace() applies regex patterns for character sanitization
  • text.format() performs placeholder substitution for template strings

Case conversion utilities include text.capitalize(), text.decapitalize(), and text.swapCase() for presentation layer transformations.
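
The naming-convention helpers can be pictured as "split into words, then rejoin". The sketch below shows that idea in plain JavaScript. It is a stand-in under stated assumptions: the word-splitting rules (lower-to-upper transitions and non-alphanumeric separators) are chosen for illustration and may differ from what text.camelCase() and text.snakeCase() do in FLEX.

```javascript
// Split an identifier into lower-case words.
// Assumed rules: break at lower/digit -> upper transitions and at any
// run of non-alphanumeric characters.
function toWords(s) {
  return s
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2')
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map(w => w.toLowerCase());
}

// first_name style
function snakeCase(s) {
  return toWords(s).join('_');
}

// firstName style: first word lower-case, later words capitalised
function camelCase(s) {
  return toWords(s)
    .map((w, i) => (i === 0 ? w : w[0].toUpperCase() + w.slice(1)))
    .join('');
}

console.log(snakeCase('firstName')); // first_name
console.log(camelCase('first_name')); // firstName
console.log(camelCase('User ID'));    // userId
```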

Collection and Map Operations

Collection functions implement set theory operations on arrays. coll.union() merges lists with deduplication, while coll.intersection() returns shared elements. coll.frequencies() counts element occurrences, useful for aggregation queries that group by value distribution.
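
The semantics described here are easy to picture in plain JavaScript. The following sketch uses local stand-in functions, not the FLEX implementations, and the return shape of frequencies (a plain object keyed by element) is an assumption.

```javascript
// Merge two lists, keeping each distinct element once (first-seen order).
function union(a, b) {
  return [...new Set([...a, ...b])];
}

// Distinct elements of a that also appear in b.
function intersection(a, b) {
  const inB = new Set(b);
  return [...new Set(a)].filter(x => inB.has(x));
}

// Count occurrences of each element. A Map avoids clashes with inherited
// object keys such as "constructor" while counting.
function frequencies(items) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return Object.fromEntries(counts);
}

console.log(union([1, 2, 2], [2, 3]));           // [ 1, 2, 3 ]
console.log(intersection([1, 2, 2, 3], [2, 3, 4])); // [ 2, 3 ]
console.log(frequencies(['a', 'b', 'a']));       // { a: 2, b: 1 }
```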

Map functions manipulate property objects common in graph node attributes:

  • map.merge() combines multiple property maps
  • map.submap() extracts specific keys
  • map.removeKeys() filters sensitive properties before serialization

The map.fromPairs() function converts key-value tuple arrays into objects, supporting dynamic property construction in query results.
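
The map helpers correspond closely to standard JavaScript object operations. The sketch below shows plain-JavaScript stand-ins with the behavior described above. Argument order, the rule that later maps win on key conflicts, and silently skipping missing keys are assumptions for illustration, not documented FLEX behavior.

```javascript
// Combine maps; on duplicate keys the later map wins (assumed precedence).
function merge(...maps) {
  return Object.assign({}, ...maps);
}

// Keep only the requested keys that actually exist on the map.
function submap(map, keys) {
  return Object.fromEntries(
    keys
      .filter(k => Object.prototype.hasOwnProperty.call(map, k))
      .map(k => [k, map[k]])
  );
}

// Drop the listed keys, e.g. before sending properties to a client.
function removeKeys(map, keys) {
  const drop = new Set(keys);
  return Object.fromEntries(Object.entries(map).filter(([k]) => !drop.has(k)));
}

// Build an object from [key, value] tuples.
function fromPairs(pairs) {
  return Object.fromEntries(pairs);
}

const person = { name: 'Ada', email: 'ada@example.com', ssn: '000-00-0000' };
console.log(removeKeys(person, ['ssn']));        // { name: 'Ada', email: 'ada@example.com' }
console.log(submap(person, ['name', 'phone']));  // { name: 'Ada' }
console.log(merge({ a: 1, b: 2 }, { b: 3 }));    // { a: 1, b: 3 }
console.log(fromPairs([['a', 1], ['b', 2]]));    // { a: 1, b: 2 }
```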

Temporal and Bitwise Functions

Date functions handle timezone conversions and temporal grouping. date.truncate() rounds timestamps to specified units (day, month, year) for time-series aggregations. date.format() applies pattern-based string formatting with timezone awareness. date.parse() converts string representations to date-time objects with optional format hints.
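
Truncation is the operation behind most time-bucketed aggregations, so it is worth seeing in miniature. The article does not show date.truncate()'s signature or its time zone handling, so the sketch below is a stand-in that works on plain JavaScript Date objects and always truncates in UTC; the function name and unit strings are assumptions.

```javascript
// Round a Date down to the start of its UTC day, month, or year.
function truncateUtc(date, unit) {
  const y = date.getUTCFullYear();
  const m = date.getUTCMonth();
  const d = date.getUTCDate();
  switch (unit) {
    case 'year':
      return new Date(Date.UTC(y, 0, 1));
    case 'month':
      return new Date(Date.UTC(y, m, 1));
    case 'day':
      return new Date(Date.UTC(y, m, d));
    default:
      throw new Error(`unsupported unit: ${unit}`);
  }
}

const ts = new Date('2024-03-15T13:45:00Z');
console.log(truncateUtc(ts, 'day').toISOString());   // 2024-03-15T00:00:00.000Z
console.log(truncateUtc(ts, 'month').toISOString()); // 2024-03-01T00:00:00.000Z
console.log(truncateUtc(ts, 'year').toISOString());  // 2024-01-01T00:00:00.000Z
```

Every timestamp within the same unit maps to the same truncated value, which is what makes the result usable as a grouping key.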

Bitwise operations enable low-level integer manipulation for permission flags and binary protocols. Standard operators include bitwise.and(), bitwise.or(), bitwise.xor(), and bitwise.not(). Shift operations via bitwise.shiftLeft() and bitwise.shiftRight() support bit-level calculations without leaving the query context.
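
For the permission-flag case, the underlying arithmetic looks like this in plain JavaScript. The flag names and helper functions are made up for illustration; inside a query the same steps would go through the bitwise.* functions named above. Note that JavaScript's bitwise operators work on 32-bit signed integers.

```javascript
// Hypothetical permission flags, one bit each.
const READ = 1;  // 0b001
const WRITE = 2; // 0b010
const ADMIN = 4; // 0b100

const grant = (mask, flag) => mask | flag;            // set a bit (OR)
const has = (mask, flag) => (mask & flag) === flag;   // test a bit (AND)
const toggle = (mask, flag) => mask ^ flag;           // flip a bit (XOR)
const revoke = (mask, flag) => mask & ~flag;          // clear a bit (AND NOT)

let perms = grant(READ, WRITE);  // 0b011 = 3
console.log(has(perms, WRITE));  // true
console.log(has(perms, ADMIN));  // false
perms = toggle(perms, READ);     // 0b010 = 2
perms = revoke(perms, WRITE);    // 0b000 = 0
console.log(perms);              // 0
```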

Common Use Cases

FLEX functions address recurring patterns in graph data processing pipelines. These use cases demonstrate how pre-built functions solve practical problems developers encounter when building graph-backed applications.

Task-to-Function Mapper

Data Cleaning & Normalization

  • Normalize field names across different data sources: text.camelCase(), text.snakeCase()
  • Remove or sanitize unwanted characters: text.replace()
  • Deduplicate lists from merged data: coll.union()

Fuzzy Matching & Search

  • Find similar strings with edit distance calculations: text.levenshtein()
  • Match names and short strings accurately: text.jaroWinkler()
  • Compare sets and calculate tag similarity: sim.jaccard()

Data Aggregation & Analysis

  • Group data by time periods (day, week, month): date.truncate()
  • Count element occurrences for distributions: coll.frequencies()
  • Select specific fields from node properties: map.submap()

API & Data Exchange

  • Serialize graph data to JSON for external APIs: json.toJson(), json.fromJsonMap()
  • Filter sensitive data before transmission: map.removeKeys()
  • Build formatted messages with variable substitution: text.format()

Permission & Flag Management

  • Check and set permission flags in bitmasks: bitwise.and(), bitwise.or()
  • Toggle flags without conditional logic: bitwise.xor()

Integration Patterns for GraphRAG and LLM Pipelines

Graph databases increasingly support retrieval-augmented generation (RAG) architectures where LLMs query structured knowledge graphs. FalkorDB positions itself for GraphRAG workloads through OpenCypher query language support and graph traversal optimizations.

UDFs extend GraphRAG implementations by encoding domain-specific retrieval logic. Text similarity functions enable approximate matching between LLM-generated query terms and graph entity labels. Date manipulation functions filter temporal contexts relevant to user prompts. JSON serialization functions format graph query results for LLM consumption via structured prompts.
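
As an illustration of the last point, here is a minimal sketch of a helper that turns a node-like object into a compact JSON string suitable for pasting into a prompt. The function name nodeToContext, its hiddenKeys parameter, and the mock node are hypothetical. It uses JavaScript's built-in JSON.stringify rather than any FLEX function, and it assumes node properties are exposed under attributes.

```javascript
// Hypothetical helper: serialise a node's id, labels, and selected
// properties as one compact JSON string.
function nodeToContext(node, hiddenKeys = []) {
  const hidden = new Set(hiddenKeys);
  const properties = Object.fromEntries(
    Object.entries(node.attributes).filter(([key]) => !hidden.has(key))
  );
  return JSON.stringify({ id: node.id, labels: node.labels, properties });
}

// Register only when running inside FalkorDB, where `falkor` is defined.
if (typeof falkor !== 'undefined') {
  falkor.register('nodeToContext', nodeToContext);
}

// Mock node for local testing (assumed shape, not a FalkorDB object).
const product = {
  id: 1,
  labels: ['Product'],
  attributes: { name: 'Lamp', cost: 12, internalNote: 'do not share' },
};

const context = nodeToContext(product, ['internalNote']);
console.log(context);
```

Dropping internal fields before serialization keeps prompts short and avoids leaking properties the model has no reason to see.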

Wrapping up

UDF Implementation Checklist

1. Write JavaScript Function

Create your custom logic in JavaScript. Functions can accept scalars, nodes, edges, paths, or collections as parameters.

    function UpperCaseOdd(s) {
      return s.split('').map((c, i) =>
        i % 2 !== 0 ? c.toUpperCase() : c
      ).join('');
    }

2. Register with falkor.register()

Expose your function to Cypher queries using the falkor object. Multiple functions can be registered in a single library.

    falkor.register('UpperCaseOdd', UpperCaseOdd);

3. Load Library via GRAPH.UDF LOAD

Use the Python client or direct Redis command to load your JavaScript code into FalkorDB.

    db.udf_load('mylib', js_code)
    # or use: GRAPH.UDF LOAD mylib "code" REPLACE

4. Test in Cypher Query

Call your function in Cypher using the library namespace. Access node properties and methods like getNeighbors().

    MATCH (n:Person)
    RETURN mylib.UpperCaseOdd(n.name)

5. Handle Data Types Correctly

Access node/edge properties via .attributes, traverse with .getNeighbors(), and use path.nodes for path operations.

  • Node: id, labels, attributes, getNeighbors()
  • Edge: id, type, startNode, endNode, attributes
  • Path: nodes, length, relationships

6. Respect Read-Only Constraints

UDFs cannot modify graph structure. Separate read operations using UDFs from write operations using CREATE, MERGE, SET, DELETE.

  • ✗ Cannot: create nodes, add edges, update properties, delete entities
  • ✓ Can: read properties, traverse relationships, compute values, return results
